(54) Method and apparatus for inhibiting reproduction of parts of a recording



(57) An image processing apparatus for reproduc- 
ing a recorded digital data stream can detect a time in- 
formation or emergency news object contained in the 
recorded digital data stream, and inhibit display of the 
time information or emergency news which is insignifi- 
cant in reproducing the recorded image. Alternatively, 
the image processing apparatus displays the object by 
an icon or the like. Alternatively, the image processing 
apparatus generates time information by a character 
generator on the basis of measured time information, 
replaces time information included in the recorded dig- 
ital data by the time information generated by the char- 
acter generator, and reproduces and displays the re- 
placed time information. 



[FIG. 12: flow chart excerpt; step S06, "Directly display and output playback image"]

Printed by Jouve, 75001 PARIS (FR)






EP 1 124 379 A2



Description 

FIELD OF THE INVENTION 

[0001] The present invention relates to an image
processing method and apparatus suitable in recording, 
reproducing, and displaying a digital television program, 
and is applicable to a recording/playback device for re- 
ceiving and recording a digital television program, and 
a television receiver, television display device, or the like 
having such a recording function. 

BACKGROUND OF THE INVENTION 

[0002] In recent years, digital television broadcasting
using satellites or cable has become popular. With the
realization of digital broadcasting, expectations are rising
for further developments, including improved broadcast
image and audio quality, increases in the number and types
of programs and in the amount of information through
compression techniques, provision of new services such as
interactive services, and evolution of the reception form.

[0003] Fig. 17 is a block diagram showing the ar- 
rangement of a conventional digital broadcasting recep- 
tion apparatus using satellite broadcasting. 
[0004] In this reception apparatus, information transmitted
by satellite broadcasting is received by an antenna 1, and
the received television information is tuned and demodulated
by a tuner 2 in a reception device 8. Then, the television
information is subjected to error correction processing (not
shown), and if necessary, to charging (billing) handling,
descrambling processing, and the like. Various data
multiplexed as the TV information are demultiplexed into
individual data by a multiplexed-signal demultiplexing
circuit 3. The demultiplexed data include image information,
audio information, and other additional data. These
demultiplexed data are decoded by a decoding circuit 4. Of
the decoded data, the image information and audio
information are converted into analog signals by a D/A
conversion circuit 5. The image and audio are respectively
displayed and output by a TV receiver 6 serving as an
externally connected display device. Note that the
additional data serves as program sub-data and is involved
in various functions.

[0005] A satellite TV program is recorded and played back
by a recording/playback device (DVD/VTR) 7. Examples of the
recording/playback device 7 are a recordable/playable DVD
(Digital Video Disk) drive and a digital VTR. The reception
device 8 and recording/playback device 7 are connected by a
data bus or the like. The recording scheme in the
recording/playback device 7 is a digital recording scheme
which performs bitstream recording. Note that bitstream
recording is not limited to the use of the DVD or digital
VTR (e.g., D-VHS type VTR), but is also supported by DVC,
which is another consumer digital recording scheme. For
example, even a digital recording device using various disk
media can record a digital television program by format
transformation or the like, as needed.

[0006] However, as a general method of displaying a
television program on a home television, an image
transmitted from a broadcasting station is displayed
directly, both in conventional terrestrial broadcasting and
in the above-described digital television broadcasting.
Similarly, in playing back a television program recorded by
a VTR, the recorded data is played back directly.

[0007] In other words, with the conventional technique it is
very difficult for the user to change the display form more
effectively in accordance with the situation when displaying
a television program, playing back and displaying a VTR
recording, or the like. Such a function will be an effective
display method as the numbers of channels and programs
increase with the development of digital television
broadcasting, and is considered indispensable as a new
function. This function, however, has not been realized yet.
[0008] For example, a telop (on-screen caption) display for
special news such as "earthquake news information", which
is important at the time of recording, is often
insignificant when the recorded television information is
played back by the recording/playback device 7, but the
recorded information is nevertheless displayed without any
change.

SUMMARY OF THE INVENTION 

[0009] The present invention has been made in con- 
sideration of the conventional situation, and has as a 
concern to provide an image processing method and ap- 
paratus capable of improving the visual effect for the us- 
er, and improving a user interface. 
[0010] It is another concern of the present invention 
to provide an image processing method and apparatus 
capable of determining an object having an attribute in- 
significant in reproducing, and controlling reproducing 
of the object. 

[0011] It is still another concern of the present invention
to provide an image processing method and apparatus capable
of displaying a predetermined picture after changing the
display form of the predetermined picture, instead of
displaying it as it is, in reproducing recorded pictures.

[0012] It is still another concern of the present invention
to provide an image processing method and apparatus capable
of not displaying an image which was significant upon
recording a picture but is insignificant in reproducing it,
or capable of changing the display form in display, as
needed.

[0013] In accordance with an aspect of the invention,
there is provided an image processing apparatus for
reproducing a recorded digital data stream, comprising:
determination means for determining whether an object
having a predetermined attribute exists in the recorded
digital data stream; and reproducing means for changing a
reproducing form of the object and reproducing the object
when the determination means determines that the object
having the predetermined attribute exists. Additionally,
the invention provides an image processing apparatus for
reproducing a recorded digital data stream, comprising:
determination means for determining whether an object
having a predetermined attribute exists in the recorded
digital data stream; designation means for designating a
reproducing form of the object having the predetermined
attribute from a plurality of reproducing forms; and
reproducing control means for reproducing an image
corresponding to the object having the predetermined
attribute in the reproducing form designated by the
designation means when the determination means determines
that the object having the predetermined attribute exists.

[0014] Other features and advantages of the present 
invention will be apparent from the following description 
taken in conjunction with the accompanying drawings, 
in which like reference characters designate the same 
or similar parts throughout the figures thereof. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0015] The accompanying drawings, which are incor- 
porated in and constitute a part of the specification, il- 
lustrate embodiments of the invention and, together with 
the description, serve to explain the principles of the in- 
vention. 

Fig. 1 is a block diagram showing the configuration 
of a display system according to an embodiment of 
the present invention; 

Fig. 2 is a block diagram showing the arrangement 
of a digital television broadcasting reception device 
according to the first embodiment of the present in- 
vention; 

Fig. 3 is a block diagram showing the arrangement
of a recording/playback device according to the first 
embodiment of the present invention; 
Fig. 4 is a block diagram showing the arrangement 
of a display device according to the first embodi- 
ment of the present invention; 
Fig. 5 is a block diagram for explaining the arrange- 
ment of the recording/playback device in Fig. 3 in 
more detail; 

Fig. 6 is a view for explaining the bitstream structure 
of MPEG4 data; 

Fig. 7 is a conceptual view for explaining the ar- 
rangement of object information contained in the 
bitstream of MPEG4 data; 

Fig. 8 is a view for explaining display switching be- 
tween a normal image and a replaced image; 
Figs. 9A and 9B are views showing display examples
in the embodiment;

Figs. 10A and 10B are views showing display ex- 
amples in the embodiment; 



Figs. 11A and 11B are views showing display examples
in the embodiment;

Fig. 12 is a flow chart for explaining an operation
sequence according to the first embodiment of the
present invention;

Fig. 13 is a block diagram showing the arrangement
of a recording/playback device according to the
second embodiment of the present invention;
Fig. 14 is a block diagram showing the arrangement
of a display device according to the second embodiment
of the present invention;
Fig. 15 is a flow chart for explaining an operation
sequence according to the second embodiment of
the present invention;

Fig. 16 is a view for explaining the transport stream
structure of MPEG2 data according to another embodiment;

Fig. 17 is a block diagram showing the configuration
of a conventional digital television broadcasting
reception system;

Fig. 18 is a block diagram for explaining an MPEG4
coding/decoding processing flow;
Fig. 19 is a block diagram showing an arrangement
considering user operation (edit) in an MPEG4 system;

Fig. 20 is a block diagram for explaining a VOP 
processing circuit block on the coding side; 
Fig. 21 is a block diagram for explaining a VOP 
processing circuit block on the decoding side; 

Fig. 22 is a block diagram showing the overall
arrangement of VOP coding and decoding;
Figs. 23A and 23B are views for explaining information
constituting a VOP, in which Fig. 23A shows an
information structure in coding in units of objects,
and Fig. 23B shows an information structure in coding
in units of frames;

Figs. 24A and 24B are views for explaining scalability
in hierarchical coding, in which Fig. 24A shows
temporal scalability, and Fig. 24B shows spatial
scalability;

Figs. 25A and 25B are views for explaining warp 
which expresses viewpoint movement in a three-di- 
mensional space, such as image movement, rota- 
tion, enlargement, or deformation; 
Fig. 26 is a view showing an example of a sprite 
image; 

Fig. 27 is a view for explaining an arrangement of 
scene description information; 
Fig. 28 is a table showing the types of MPEG4 audio
coding schemes;

Fig. 29 is a block diagram for explaining an audio 
signal coding scheme; 

Fig. 30 is a view for explaining generation of a coded 
bitstream in MPEG4; 
Fig. 31 is a view for explaining an MPEG4 layer
structure;

Figs. 32A and 32B are views for explaining bidirectionally
decodable variable-length coding;
Figs. 33A and 33B are views for explaining enhancement
of error robustness in MPEG4;
Fig. 34 is a conceptual view for explaining the ar- 
rangement of object information according to the 
third embodiment of the present invention; 
Fig. 35 is a block diagram showing the arrangement 
of a recording/playback device according to the 
third embodiment in detail; 

Fig. 36 is a view for explaining a change of the play- 
back form in an object controller in Fig. 35; 
Figs. 37A to 37D are views for explaining display 
examples according to the third embodiment of the 
present invention; 

Figs. 38A to 38C are views for explaining playback 
display/output examples according to the third em- 
bodiment of the present invention; 
Fig. 39 is a flow chart for explaining a playback 
processing sequence according to the third embod- 
iment of the present invention; 
Fig. 40 is a block diagram showing the arrangement 
of a recording/playback device according to the 
fourth embodiment of the present invention; 
Fig. 41 is a block diagram showing the arrangement 
of a display device according to the fourth embodi- 
ment; and 

Fig. 42 is a flow chart for explaining a playback 
processing sequence according to the fourth em- 
bodiment of the present invention. 

DETAILED DESCRIPTION OF THE PREFERRED 
EMBODIMENTS 

[0016] To solve the conventional problems, embodiments
of the present invention propose new functions as an
effective playback (reproducing)/display method, both for
digital television broadcast reception and display and for
a recording/playback (reproducing) device that records and
plays back (reproduces) television information. The new
functions are realized by constituting a broadcasting
system using MPEG4, which has recently been undergoing
standardization, in addition to MPEG2, which is adopted as
a conventional digital television broadcasting coding
scheme.
[0017] Details of MPEG4 will be described later. The 
use of the concept of an object, which is a characteristic 
feature of MPEG4 coding, enables output control and 
display control in units of objects in an image. A device
for recording/playing back or displaying an MPEG4
television program can display image data having
predetermined object attribute data by changing its display
form from that of the original.

[0018] For example, suppose the predetermined object
information is a real-time image object (e.g., an object
such as a time display or weather forecast which is useful
only when the image was broadcast). In playing back
television information by a recording/playback device,
control can then be performed so as not to display the
real-time image object, instead of playing back and
displaying the recorded image as it is in the display form
of the original recorded in the past. Control can also be
performed to change, e.g., the time display in
correspondence with the current time and display the
changed time (replacement processing).
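
The control described in [0018] can be illustrated with a minimal Python sketch. It is not the apparatus of the embodiments: the class `PlaybackObject`, the attribute labels, and the three mode names are hypothetical, and the choice to hide non-time real-time objects in "replace" mode is an assumption made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical attribute labels for objects that are only useful at broadcast time.
REAL_TIME_ATTRIBUTES = {"time", "weather", "emergency_news"}


@dataclass
class PlaybackObject:
    object_id: int
    attribute: str  # hypothetical attribute label, e.g. "video" or "time"
    content: str    # stand-in for the decoded picture or text data


def reproduce(objects, mode, now=None):
    """Return the objects to display for one scene.

    mode "as_recorded": display everything as recorded.
    mode "hide":        do not display real-time objects.
    mode "replace":     replace a time object with the current time and
                        hide the other real-time objects (assumption).
    """
    now = now or datetime.now()
    shown = []
    for obj in objects:
        if mode == "as_recorded" or obj.attribute not in REAL_TIME_ATTRIBUTES:
            shown.append(obj)
        elif mode == "replace" and obj.attribute == "time":
            # Replacement processing: regenerate the time display.
            shown.append(PlaybackObject(obj.object_id, "time", now.strftime("%H:%M")))
        # Any other real-time object is simply not displayed.
    return shown
```

Calling `reproduce(scene, "replace")` on a recorded scene would keep the program picture and show the playback-time clock in place of the recorded one.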

[0019] Objects include the background or a speaker in an
image, a CG image, and the speech of a speaker. The MPEG4
coding scheme codes/decodes an image in units of objects
and combines the objects to express one scene.

[0020] As an example of the display control function
according to the embodiment, in a device for recording and
playing back MPEG4 information or a device for displaying
played-back information, the display form of an object
formed from predetermined object information is changed
between recording (original image) and playback on the
basis of attribute information (object information)
defined for each object.

[0021] By realizing the embodiment, display of real-time
information such as time information in broadcasting can
be easily changed in correspondence with the current time.
This is effective in adding a new function to
recording/playback of a television program.
[0022] A preferred embodiment of the present invention
will be described in detail below with reference to
the accompanying drawings.

[0023] In the embodiment, an image signal coded by
the MPEG4 coding scheme is received, recorded, and
played back. The MPEG4 technique will be explained in
detail for respective fields.

<Overall Arrangement of Standard> 

[0024] The MPEG4 standard is roughly made up of four
items. Three of these items are similar to those of MPEG2:
a visual part, an audio part, and a system part.
(1) Visual Part 


[0025] An object coding scheme of processing a nat- 
ural image, synthesized image, moving image, and still 
image is standardized. 

[0026] This object coding scheme includes a coding
scheme, sync playback function, and hierarchical coding
suitable for correction and repair of a transmission
line error. In terms of terminology, "video" means a
natural image, while "visual" also covers a synthesized
image.


(2) Audio Part 

[0027] An object coding scheme of processing a natural
sound, synthesized sound, sound effect, and the like is
standardized. In the video and audio parts, a plurality of
coding schemes are defined, and a compression scheme
suitable for the feature of each object is appropriately
selected to increase the coding efficiency.
(3) System Part

[0028] Multiplexing processing for coded picture and 
audio objects, and demultiplexing processing are de- 
fined. A buffer memory, time axis control, and a read- 
justment function are also included in this part. 
[0029] Picture and audio objects coded in parts (1) 
and (2) are integrated into a multiplexed stream in the 
system part together with scene building information 
which describes the position, appearance time, and dis- 
appearance time of an object in a scene. 
[0030] As decoding processing for a coded signal, 
each object is demultiplexed/decoded from a received 
bitstream, and a scene is reconstructed based on scene 
building information. 
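
As a rough illustration of the scene building information in [0029] and [0030], the sketch below models only the three properties named there (position, appearance time, disappearance time). The class `SceneEntry` and the function `visible_objects` are hypothetical names; the actual MPEG4 scene description format is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class SceneEntry:
    """Hypothetical record of scene building information for one object."""
    object_id: int
    x: int             # position of the object in the scene
    y: int
    appear: float      # appearance time in seconds
    disappear: float   # disappearance time in seconds


def visible_objects(entries, t):
    """Return the IDs of the objects that belong to the scene at time t."""
    return [e.object_id for e in entries if e.appear <= t < e.disappear]
```

A scene synthesizer would consult such entries to decide which decoded objects to place, and where, when reconstructing each scene.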

<Object Coding> 

[0031] MPEG2 processes a frame or field as a unit.
In contrast, MPEG4 processes picture data and audio
data as objects in order to realize reuse and editing of
contents.

[0032] The types of objects are as follows.

Speech
Natural Image (Background Image: Two-dimensional Fixed Image)
Synthesized Image (Principal Object Image: No Background)
Character Information

[0033] Fig. 18 is a block diagram showing a system 
configuration when these objects are simultaneously in- 
put and coded. 

[0034] The objects are respectively coded by a
speech object encoder 5001, natural image object
encoder 5002, synthesized-image object encoder 5003,
and character object encoder 5004. At the same time,
the relationship of these objects in the scene is coded
as scene building information by a scene description
information encoder 5005, and coded into an MPEG4
bitstream by a data multiplexer 5006 together with the
pieces of coded object information.
[0035] On the coding side, a combination of visual and 
audio objects is defined to express one scene (frame). 
Visual objects can constitute a scene as a combination 
of a natural image and a synthesized image such as a 
computer graphic. 

[0036] With this arrangement, an object image and 
speech can be played back using, e.g., a text to speech 
synthesis function in synchronism with each other. In ad- 
dition, an MPEG4 bitstream can be transmitted/received 
or recorded/played back. 

[0037] Decoding processing of a coded bitstream is the
reverse of coding processing. That is, a data
demultiplexer 5007 demultiplexes and distributes an MPEG4
bitstream in units of objects. Respective objects such as
a speech, natural image, synthesized image, and character
are decoded into object data by corresponding decoders
5008 to 5011. Scene description information is also
simultaneously decoded by a scene description decoder
5012. A scene synthesizer 5013 synthesizes the original
scene again using these pieces of decoded information.
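
The symmetry between the data multiplexer 5006 and the data demultiplexer 5007 can be sketched as follows. This is only a toy model under stated assumptions: packets are tagged with a hypothetical object-kind string and interleaved round-robin, which is not the MPEG4 multiplexing syntax.

```python
from collections import defaultdict


def multiplex(streams):
    """Interleave per-object packet lists into one list of (kind, payload).

    `streams` maps a hypothetical object kind (e.g. "speech") to its coded
    packets. Round-robin interleaving is an assumption for illustration.
    """
    muxed = []
    longest = max((len(packets) for packets in streams.values()), default=0)
    for i in range(longest):
        for kind, packets in streams.items():
            if i < len(packets):
                muxed.append((kind, packets[i]))
    return muxed


def demultiplex(muxed):
    """Distribute the multiplexed packets back to one list per object kind."""
    out = defaultdict(list)
    for kind, payload in muxed:
        out[kind].append(payload)
    return dict(out)
```

Each list returned by `demultiplex` would then be handed to the decoder for that object kind, with the scene description stream going to the scene description decoder.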

[0038] On the decoding side, the positions of visual
objects or the order of audio objects in a scene can be
changed. The object positions can be changed by a drag
operation. The user can change the language or the like
by changing an audio object.
[0039] To synthesize a scene by freely combining a
plurality of objects, the following four items are
prescribed.


(a) Object Coding 

[0040] A visual object, audio object, and AV (Audio 
Visual) object as a combination of them are coded. 


(b) Scene Synthesis 

[0041] A language as a modification of VRML (Virtual
Reality Modeling Language) is used to define scene
building information and a synthesis scheme for
constituting visual, audio, and AV objects into a desired
scene.

(c) Multiplexing and Synchronization 

[0042] For example, the form of a stream (elementary
stream) obtained by multiplexing and synchronizing
objects is determined.

[0043] This stream can be supplied to a network, and
the QOS (Quality Of Service) in storing the stream in a
recording device can also be set. QOS parameters are
transmission path conditions such as a maximum
transmission rate, error rate, and transmission scheme,
the decoding ability, and the like.



[0044] A scheme of synthesizing visual and audio ob- 
jects on the user terminal side is defined. 
[0045] An MPEG4 user terminal demultiplexes data 
transmitted from a network or recording device into el- 
ementary streams, which are decoded in units of ob- 
jects. A scene is reconstructed from a plurality of coded 
data on the basis of simultaneously transmitted scene 
building information. 

[0046] Fig. 19 shows a system configuration which 
considers user operation (edit). Fig. 20 is a block dia- 
gram showing a VOP processing circuit concerning a 
video object on the coding side, and Fig. 21 is a block 
diagram showing the decoding side. 

• VOP (Video Object Plane) 

[0047] In coding an MPEG4 image, target picture objects
are coded separately for the shape and texture.
The picture data unit is called a VOP.

[0048] Fig. 22 is a block diagram showing the overall
VOP coding/decoding arrangement. For example, when
an image is made up of two objects, a person and a
background, each frame is divided into two VOPs, which
are coded.

[0049] As shown in Fig. 23A, information constituting
each VOP is object shape information, motion information,
and texture information. The decoder separates a
bitstream into VOPs, individually decodes them, and
synthesizes them to display an image.
[0050] With the use of the VOP structure, when an im- 
age to be processed is made up of a plurality of picture 
objects, the image can be divided into a plurality of 
VOPs to individually code/decode them. If the number 
of VOPs is "1", and the object shape is rectangular,
conventional coding in units of frames is performed, as
shown in Fig. 23B.

[0051] The VOP employs three prediction schemes, 
i.e., intra-coding (I-VOP), predictive coding (P-VOP),
and bidirectionally predictive coding (B-VOP). The pre- 
diction unit in the prediction scheme is a macroblock of 
16x16 pixels. 

[0052] A bidirectional prediction VOP (B-VOP) is a 
VOP which is bidirectionally predicted from past and fu- 
ture VOPs, similar to a B picture of MPEG1 or MPEG2. 
One of four modes (direct coding, forward coding, backward
coding, and bidirectional coding) can be selected in units
of macroblocks.

[0053] Bidirectional predictive coding can switch the 
mode in units of MBs or blocks, and bidirectional pre- 
diction is done by scaling of the motion vector of a P- 
VOP. 

<Shape Coding> 

[0054] To process an image in units of objects, the ob- 
ject shape must have already been known in coding and 
decoding. To express an object such as glass through 
which another object is seen, information representing 
the transparency of the object is required. The object 
shape and object transparency information are called 
shape information. Coding of the shape information is 
called shape coding. 

<Size Transformation Processing> 

[0055] Binary shape coding is a method of coding, for each
pixel, whether the pixel lies inside or outside the
boundary of an object. The smaller the number of pixels to
be coded, the smaller the coding amount. However, if the
macroblock size to be coded is decreased, the original
shape is degraded by the coding before it is transmitted
to the receiving side. To prevent this, the degree of
degradation of the original information caused by size
transformation is measured, and a smaller size is selected
as long as the size transformation error is a
predetermined threshold or less. There are three examples
of the size transformation ratio: one-to-one ratio, 1/2
aspect ratio, and 1/4 aspect ratio.
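
The selection rule in [0055] can be sketched in Python as follows. The text does not specify how the block is reduced or how the error is measured, so the majority-vote reduction, the pixel-repetition enlargement, and the mismatch count used here are assumptions; the function names are hypothetical, and the mask dimensions are assumed to be multiples of every factor.

```python
def downsample(mask, f):
    """Reduce a binary mask by factor f with a majority vote per f x f cell."""
    h, w = len(mask), len(mask[0])
    return [
        [
            1 if 2 * sum(mask[y + dy][x + dx]
                         for dy in range(f) for dx in range(f)) >= f * f else 0
            for x in range(0, w, f)
        ]
        for y in range(0, h, f)
    ]


def upsample(small, f):
    """Enlarge a mask by repeating each sample f times in both directions."""
    return [[v for v in row for _ in range(f)] for row in small for _ in range(f)]


def choose_factor(mask, threshold, factors=(4, 2, 1)):
    """Pick the strongest reduction whose reconstruction error <= threshold.

    Factors 1, 2, 4 correspond to the one-to-one, 1/2, and 1/4 ratios.
    Returns (factor, number of mismatched pixels).
    """
    for f in factors:
        restored = upsample(downsample(mask, f), f)
        error = sum(a != b
                    for row_a, row_b in zip(mask, restored)
                    for a, b in zip(row_a, row_b))
        if error <= threshold:
            return f, error
    return 1, 0
```

A shape whose boundary happens to align with the 4 x 4 cells survives the 1/4 reduction without error, whereas fine detail forces a milder ratio unless the threshold tolerates the loss.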

[0056] Shape information of each VOP is given as an
8-bit value α, and defined as follows.

α = 0: outside of a corresponding VOP
α = 1 to 254: displayed in a semitransparent state with another VOP
α = 255: display region of only a corresponding VOP

[0057] Binary shape coding is executed when the value α
takes only "0" or "255", and the shape is expressed
by only the inside and outside of a corresponding VOP.
Multivalued shape coding is executed when the value α
can take all the values "0" to "255". This coding can
express a semitransparent state in which a plurality of
VOPs overlap each other.
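
The distinction between the two shape coding modes, and the meaning of an intermediate α, can be sketched as below. The function names are hypothetical, and the linear blending formula is a common convention assumed for illustration; the text above does not state how semitransparent samples are combined.

```python
def shape_coding_mode(alpha_plane):
    """Classify an alpha plane as "binary" or "multivalued".

    Binary: every alpha value is 0 or 255. Multivalued: otherwise.
    """
    values = {a for row in alpha_plane for a in row}
    return "binary" if values <= {0, 255} else "multivalued"


def composite(front, back, alpha):
    """Blend one 8-bit sample of a VOP over another VOP (assumed formula).

    alpha = 255 shows only the front VOP, alpha = 0 only the back VOP,
    and intermediate values give a semitransparent mix.
    """
    return (alpha * front + (255 - alpha) * back + 127) // 255
```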

[0058] Similar to texture coding, motion compensation
prediction is performed at a precision of one pixel
in units of blocks each made up of 16 x 16 pixels. When
the entire object is subjected to intra-coding, shape
information is not predicted. The motion vector is
expressed as the difference from a motion vector predicted
from an adjacent block. The difference value of the
obtained motion vector is coded and then multiplexed into
a bitstream. In MPEG4, shape information in units of
motion compensation-predicted blocks is coded into a
binary shape.

<Feathering> 


[0059] Feathering (smoothing of the boundary shape)
is used when the boundary is smoothly changed from
an opaque portion to a transparent portion even in a
binary shape. Feathering includes a linear feathering
mode in which a boundary value is linearly interpolated,
and a feathering filter mode using a filter. A constantly
opaque multivalued shape has a constant α mode, and
can be combined with feathering.



<Texture Coding>

[0060] The luminance component and color difference
components of an object are coded, and processed in the
order of DCT, quantization, predictive coding, and
variable length coding in units of fields/frames.
[0061] DCT uses a block of 8 x 8 pixels as a processing
unit. When an object boundary is within a block, pixels
outside the object are compensated by the average
value of the object. Processing with a 4-tap
two-dimensional filter prevents generation of a high
pseudo peak in a DCT transformation coefficient.
[0062] Quantization adopts a quantization unit based
on the ITU-T recommendation H.263 or an MPEG2
quantization unit. The use of the MPEG2 quantization
unit enables nonlinear quantization of a DC component
and frequency weighting of an AC component.
[0063] The intra-coding coefficient after quantization
is predictive-coded between blocks before variable
length coding to delete a redundant component.
Particularly in MPEG4, both DC and AC components are
predictive-coded.
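
The boundary-block compensation mentioned in [0061] (pixels outside the object are filled with the average value of the object) can be sketched as follows. The function name and the mask representation are hypothetical, rounding to the nearest integer is an assumption, and the subsequent 4-tap filtering step is not modelled.

```python
def pad_boundary_block(block, inside):
    """Fill samples outside the object with the mean of the samples inside.

    `block` holds pixel values; `inside` is a same-sized mask whose truthy
    entries mark pixels belonging to the object.
    """
    inner = [v for row, mask_row in zip(block, inside)
             for v, keep in zip(row, mask_row) if keep]
    if not inner:
        # No object pixels in this block: nothing to average, return a copy.
        return [row[:] for row in block]
    mean = round(sum(inner) / len(inner))
    return [[v if keep else mean for v, keep in zip(row, mask_row)]
            for row, mask_row in zip(block, inside)]
```

Filling the outside region with a value close to the object's own samples avoids an artificial edge at the object boundary, which would otherwise spread energy into the high-frequency transform coefficients.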

[0064] In AC/DC predictive coding in texture coding,
the differences (gradients) of corresponding quantization
coefficients between adjacent blocks are checked,
and a smaller coefficient is used for prediction. In coding
a DC coefficient x, c is used for prediction for
|a - b| < |b - c|, or a is used for |a - b| ≥ |b - c|.
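
The selection rule of [0064] amounts to a two-way comparison of gradients, sketched below. The text does not say where the blocks a, b, and c lie; the layout used in the comments (a to the left, b to the upper left, c above the current block) is an assumption, as are the function names.

```python
def predict_dc(a, b, c):
    """Select the DC predictor for the current block.

    Assumed layout: a = left block, b = upper-left block, c = upper block.
    c is used when |a - b| < |b - c|, otherwise a is used.
    """
    return c if abs(a - b) < abs(b - c) else a


def dc_residual(x, a, b, c):
    """Return the prediction error that would be coded for DC coefficient x."""
    return x - predict_dc(a, b, c)
```

For example, with a = 10, b = 12, and c = 40, the rule selects c, so a current DC value of 45 is coded as the small residual 5 instead of 45.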
[0065] In predicting an AC coefficient x, a prediction 
value is selected similarly to the DC coefficient, and nor- 
malized by the quantization scale value (QP) of each 
block. 

[0066] In predictive coding of a DC component, the 
difference (vertical gradient) between the DC compo- 
nents of vertically adjacent blocks and the difference 
(horizontal gradient) between the DC components of 
horizontally adjacent blocks are checked between adja- 
cent blocks, and the difference between the DC compo- 
nents of blocks in a direction in which the gradient de- 
creases is coded as a prediction error. 
[0067] In predictive coding of an AC component, a 
corresponding coefficient value of an adjacent block is
used in accordance with predictive coding of a DC com- 
ponent. The quantization parameter value may change 
between blocks, so that the difference is calculated up- 
on normalization (quantization step scaling). The pres- 
ence/absence of prediction can be selected in units of 
macroblocks. 

[0068] The AC component undergoes three-dimensional
(Last, Run, Level) variable length coding after zig-zag
scan. In this case, Last is a 1-bit value that represents
the end of the coefficients other than "0", Run is the
length of a succession of "0"s, and Level is a non-zero
coefficient value.
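
The (Last, Run, Level) representation of [0068] can be sketched as follows. The sketch produces the event triples only; mapping the triples to variable length codes is not shown, and the function names are hypothetical. The conventional zig-zag order is assumed for the scan.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Cells are grouped by anti-diagonal; the direction alternates per diagonal.
    return sorted(
        cells,
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def last_run_level(block):
    """Convert the AC coefficients of a block into (last, run, level) events.

    run   = number of zeros preceding the non-zero coefficient,
    level = the non-zero coefficient value,
    last  = 1 for the final non-zero coefficient, else 0.
    """
    scanned = [block[r][c] for r, c in zigzag_order(len(block))][1:]  # skip DC
    events, run = [], 0
    for value in scanned:
        if value == 0:
            run += 1
        else:
            events.append([0, run, value])
            run = 0
    if events:
        events[-1][0] = 1
    return [tuple(event) for event in events]
```

Because the trailing zeros after the event with last = 1 are never emitted, a sparse block is described by only a handful of triples.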

[0069] Variable length coding of an intra-coded DC 
component uses either a DC component variable length 
coding table or an AC component variable length table. 

<Motion Compensation> 

[0070] MPEG4 can code a VOP (Video Object Plane)
having an arbitrary shape. The VOP has intra-coding
(I-VOP), predictive coding (P-VOP), and bidirectionally
predictive coding (B-VOP) depending on the type of
prediction. The prediction unit is a macroblock of
16 lines x 16 pixels or 8 lines x 8 pixels. For this
reason, a given macroblock may straddle the boundary of a
VOP. To increase the prediction efficiency at the VOP
boundary, padding (compensation) and polygon matching
(matching of only an object) are performed for a
macroblock on the boundary.

<Wavelet Coding> 

[0071] Wavelet transformation is a transformation
scheme in which a plurality of functions obtained by
enlarging/reducing/translating one solitary wave function
are used as a transformation basis. A still image coding
mode (texture coding mode) using wavelet transformation
is suitable as a high-quality coding scheme having
various spatial resolutions ranging from high to low
resolutions especially when synthesizing CG and natural
images.

[0072] As the effects of wavelet coding, an image can
be coded at once without any block division, so no
block distortion is generated even at a low bit rate, and
mosquito noise can be decreased. In this manner, in the
MPEG4 still image coding mode, wide scalability from a
low-resolution, low-quality image to a high-resolution,
high-quality image, the processing complexity, and the
tradeoff of the coding efficiency can be selected in
accordance with the application.

<Hierarchical Coding (Scalability)>

[0073] To realize scalability, a syntax hierarchical 
structure as shown in Figs. 24A and 24B is formed. 
[0074] Hierarchical coding is realized by using, e.g., 
a base layer as a lower layer and an enhancement layer 
as a higher layer, and coding "difference information" of 
improving the image quality of the base layer by the en- 
hancement layer. 

[0075] In spatial scalability, the base layer represents 
a low-resolution moving image, and (base layer + en- 
hancement layer) represents a high-resolution moving 
image. 

[0076] Hierarchical coding can not only hierarchically
improve the quality of an entire image, but also improve
the quality of only an object region in the image. For
example, for temporal scalability, the base layer is
obtained by encoding an entire image at a low frame rate,
and the enhancement layer is obtained by encoding data
for increasing the frame rate of a specific object within
the image.

[Temporal Scalability: Fig. 24A]

[0077] The temporal scalability hierarchically sets the 
frame rate, and the frame rate of the object of an en- 
hancement layer can be increased. The presence/ab- 
sence of the hierarchy can be set in units of objects. 
There are two types of enhancement layers: Type 1 is 
formed from part of the object of a base layer, and Type 
2 is formed from the same object as that of a base layer. 

[Spatial Scalability: Fig. 24B] 

[0078] Spatial scalability hierarchically sets the spa- 
tial resolution. The base layer can be down-sampled to 
an arbitrary size. The base layer is used for prediction 
of an enhancement layer. 

<Sprite Coding> 

[0079] A sprite is a planar object which can be entirely
expressed by uniform movement, rotation, deformation, or the
like, such as a background in an image in the
three-dimensional space. A method of coding a planar object
is called sprite coding.

[0080] Sprite coding processes are classified into four
types: static, dynamic, online, and offline. More
specifically, object data is sent to a decoder in advance,
and only a global motion coefficient is transmitted in real
time. Sprites include a static sprite obtained by direct
transformation of a template, a dynamic sprite obtained by
predictive coding from a temporally preceding sprite, an
offline sprite which is encoded by intra-coding (I-VOP) in
advance and transmitted to the decoder side, and an online
sprite simultaneously created by an encoder and decoder
during coding.
[0081] Techniques examined for sprite coding are 
schemes (tools) such as static sprite coding, dynamic 
sprite coding, and global motion compensation. 

[Static Sprite Coding] 

[0082] In static sprite coding, the background (sprite) 
of an entire video clip is coded in advance, and part of 
the background is geometrically transformed to express 
an image. The image of the cut part can be variously 
deformed, e.g., translated, enlarged, reduced, and ro- 
tated. As shown in Fig. 25B, expressing viewpoint 
movement in the three-dimensional space, such as im- 
age movement, rotation, enlargement, and deformation, 
is called warp. 

[0083] Fig. 25A shows the types of warp. The types of warp
include perspective transformation, affine transformation,
isotropic enlargement (a)/rotation (θ)/movement (c, f), and
translation. These methods are given by equations in
Fig. 25A, and respective coefficients can represent
movement, rotation, enlargement, deformation, and the like.
A sprite is generated in an offline state before the start
of coding.
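
The equations of Fig. 25A are not reproduced in this text, so the following is only a sketch of the four warp types using their commonly used forms. The parameter names (a to h, theta) and the sign convention of the rotation are assumptions for illustration and may differ from the figure.

```python
import math

def perspective(x, y, a, b, c, d, e, f, g, h):
    """Perspective transformation with eight coefficients."""
    w = g * x + h * y + 1.0
    return (a * x + b * y + c) / w, (d * x + e * y + f) / w

def affine(x, y, a, b, c, d, e, f):
    """Affine transformation with six coefficients."""
    return a * x + b * y + c, d * x + e * y + f

def similarity(x, y, a, theta, c, f):
    """Isotropic enlargement a, rotation theta (radians), movement (c, f)."""
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    return (a * (cos_t * x + sin_t * y) + c,
            a * (-sin_t * x + cos_t * y) + f)

def translation(x, y, c, f):
    """Pure translation by (c, f)."""
    return x + c, y + f
```

Each type is a special case of the one before it: setting g = h = 0 reduces the perspective form to the affine form, and setting a = 1 and theta = 0 reduces the similarity form to a translation.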

[0084] In this way, static sprite coding is realized by 
cutting a partial region of a background image, and 
warping and expressing the region. 
[0085] Fig. 26 is a view showing an example of a sprite 
image. A surrounded partial region in the entire back- 
ground image is warped. More specifically, this back- 
ground includes a background image such as an audi- 
torium in a tennis match, and the warped portion in- 
cludes an image with a motion part such as a player. In 
static sprite coding, only a geometric transformation pa- 
rameter is coded without coding any prediction error. 

[Dynamic Sprite Coding] 

[0086] A sprite is generated before coding in the static 
sprite coding scheme, while a sprite can be updated in 
an online state during coding in the dynamic sprite cod- 
ing scheme. The dynamic sprite coding is different from 
static sprite coding in that a prediction error is coded. 



[Global Motion Compensation (GMC)] 

[0087] Global motion compensation is a technique of
expressing the motion of an entire object by one motion
vector and compensating for the motion without dividing the
motion into blocks. This technique is suitable for motion
compensation of a rigid body or the like. Global motion
compensation resembles dynamic sprite coding in that a
prediction error is coded, but its reference image is an
immediately preceding decoded image instead of a sprite.
Moreover, global motion compensation is different from
static sprite coding and dynamic sprite coding in that
neither a memory for storing a sprite nor shape information
is required. This is effective for the motion of an entire
frame, an image including a zoomed image, and the like.

<Scene Structure Description Information>

[0088] Objects are synthesized based on scene build- 
ing information. In MPEG4, building information for syn- 
thesizing objects into a scene is transmitted. Upon re- 
ception of individually coded objects, they can be syn- 
thesized into a scene intended by the transmitting side 
by using scene building information. 
[0089] This scene building information contains the 
display time and display position of an object. The dis- 
play time and display position are described as tree-like 
node information. Each node has relative time informa- 
tion and relative space coordinate position information 
on the time axis with respect to the parent node. 
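
As a rough illustration of this tree-like node information, the sketch below accumulates relative time and position offsets from the top node down to each node. The class name, field names, and units are hypothetical and are not BIFS syntax.

```python
from dataclasses import dataclass, field

@dataclass
class SceneNode:
    name: str
    dt: float = 0.0   # display time relative to the parent node
    dx: float = 0.0   # horizontal position relative to the parent node
    dy: float = 0.0   # vertical position relative to the parent node
    children: list = field(default_factory=list)

def resolve(node, t0=0.0, x0=0.0, y0=0.0, out=None):
    """Walk the tree and turn relative offsets into absolute values."""
    if out is None:
        out = {}
    t, x, y = t0 + node.dt, x0 + node.dx, y0 + node.dy
    out[node.name] = (t, x, y)
    for child in node.children:
        resolve(child, t, x, y, out)
    return out
```

For example, a clock object placed 2 seconds after, and 500 pixels to the right of, a parent that itself starts 10 seconds into the scene resolves to an absolute display time of 12 seconds.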
[0090] Languages for describing the scene building
information include BIFS (Binary Format for Scenes), a
modification of VRML, and AAVS (Adaptive Audio-Visual
Session Format), which uses Java. BIFS describes MPEG4 scene
building information in binary form. AAVS is based on Java,
has a high degree of freedom, and complements BIFS.
[0091] Fig. 27 shows a structure of scene description 
information. 

<Scene Description> 

[0092] A scene is described by BIFS (Binary Format for
Scenes). In this case, the scene graph and the node, which
are concepts common to VRML and BIFS, will be mainly
explained. A node designates a grouping of lower nodes which
have attributes such as a light source, shape, material,
color, and coordinates, and which are subjected to
coordinate transformation. The object-oriented concept is
adopted, and the layout and viewing of objects in the
three-dimensional space are determined by tracing a tree
called a scene graph from the top node and inheriting the
attributes of higher nodes. If a leaf node is synchronously
assigned a media object, e.g., an MPEG4 video bitstream, a
moving image can be synthesized with other graphics in the
three-dimensional space and output.
[0093] The difference from VRML is as follows.
[0094] Through BIFS, the MPEG4 system supports:

1. Two-dimensional overlap relationship description
of MPEG4 video VOP coding, and MPEG4 audio
synthetic description

2. Synchronous processing of successive media
streams

3. Expression of dynamic behavior of an object
(e.g., sprite)

4. Standardization of the transmission form (binary)

5. Dynamic change of scene description during a
session

[0095] Nearly all the VRML nodes are supported by BIFS; the
exceptions include Extrusion, Script, Proto, and
Extern Proto.
[0096] Special MPEG4 nodes newly added by BIFS 
are as follows. 

1. Node for 2D/3D synthesis

2. Node for 2D graphics and text 

3. Animation node 

4. Audio node 

[0097] It should be noted that VRML does not support 
2D synthesis except for a special node such as a back- 
ground, while BIFS expands description so as to proc- 
ess a text, graphic overlay, and MPEG4 video VOP cod- 
ing in units of pixels. 

[0098] At the animation node, a special node for an MPEG4 CG
image such as a 3D mesh face is defined. A message (BIFS
Update) capable of dynamically performing replacement,
deletion, addition, and attribute change of a node in a
scene graph allows displaying a new moving image and adding
a button on a frame during a session. BIFS can be realized
by replacing each VRML reserved word, node identifier, and
attribute value by binary data in an almost one-to-one
correspondence.

<MPEG4 Audio> 

[0099] Fig. 28 is a table showing the type of MPEG4 
audio coding scheme. 

[0100] Audio coding includes parametric coding, CELP coding,
and time/frequency conversion coding. Further, an SNHC audio
function is also adopted, and this coding also includes SA
(Structured Audio) coding and TTS (Text To Speech) coding.
SA is a structural description language for synthesized
speech including MIDI. TTS is a protocol for transmitting
intonation or phonemic information to an external
text-to-speech synthesizer.

[0101] Fig. 29 is a block diagram showing the arrangement of
an audio coding scheme.
[0102] In Fig. 29, an input speech signal is pre-processed
(201), and divided in signal division 202 in accordance with
the band so as to properly use three encoders, i.e.,
parametric encoder, CELP encoder, and time/frequency
encoder. The divided signals are respectively input to
appropriate encoders. In signal analysis control 203, the
input speech signal is analyzed to generate control
information or the like for classification to the respective
encoders in accordance with the signal. Subsequently, a
parametric coding core 204, CELP coding core 205, and
time/frequency conversion coding core 206 as different
encoders execute coding processing based on respective
coding schemes. The three coding schemes will be explained
later. Of the coded audio data, outputs from the parametric
coding core 204 and CELP coding core 205 are input to a
small-step enhancing circuit 207. Outputs from the
time/frequency conversion coding core 206 and small-step
enhancing circuit 207 are input to a large-step enhancing
circuit 208. The small- and large-step enhancing circuits
207 and 208 are tools for decreasing distortion generated in
coding processing of each coding core. Audio data output
from the large-step enhancing circuit 208 is a coded speech
bitstream.

[0103] The arrangement of the audio coding scheme in Fig. 29
has been described.

[0104] The respective coding schemes will be ex- 
plained with reference to Fig. 28. 

<Parametric Coding> 

[0105] Speech and tone signals are expressed as parameters
such as the frequency, amplitude, and pitch, and coded.
Parametric coding includes HVXC (Harmonic Vector Excitation
Coding) coding for a speech signal, and IL (Individual Line)
coding for a tone signal.

<HVXC Coding> 

[0106] HVXC coding mainly targets speech coding at
2 kbits/sec to 4 kbits/sec. Speech signals are classified
into voiced and unvoiced sounds. For a voiced sound, the
harmonic structure of the residual signal of an LPC (Linear
Prediction Coefficient) is vector-quantized. For an unvoiced
sound, a prediction residue directly undergoes vector
excitation coding.

<IL Coding> 

[0107] IL coding targets tone coding at 6 kbits/sec to
16 kbits/sec. A signal is modeled by a line spectrum, and
coded.



<CELP (Code Excited Linear Prediction) Coding>

[0108] CELP coding is a scheme of coding an input speech
signal by dividing it into spectral envelope information and
sound source information (prediction error). Spectral
envelope information is represented by a linear prediction
coefficient calculated by linear prediction analysis from an
input speech signal.
[0109] MPEG4 CELP coding includes narrow-band CELP having a
bandwidth of 4 kHz, and wide-band CELP having a bandwidth of
8 kHz. NB (Narrow Band) CELP can select a bit rate between
3.85 kbits/sec and 12.2 kbits/sec, and WB (Wide Band) CELP
can select a bit rate between 13.7 kbits/sec and
24 kbits/sec.

<T/F (Time/Frequency) Conversion Coding> 

[0110] T/F conversion coding is a coding scheme for high
speech quality. This coding includes a scheme complying with
AAC (Advanced Audio Coding), and TwinVQ (Transform-domain
Weighted Interleave Vector Quantization).

[0111] An auditory psychological model is incorporated in
the T/F conversion coding arrangement, and adaptive
quantization using an auditory masking effect is performed.

<AAC-Compliant Scheme> 

[0112] An audio signal is converted into the frequency
domain by DCT or the like, and subjected to adaptive
quantization using an auditory masking effect. The adaptive
bit rate ranges from 24 kbits/sec to 64 kbits/sec.

<TwinVQ Scheme> 

[0113] The MDCT coefficient of an audio signal is flattened
using a spectral envelope obtained by performing linear
prediction analysis for the audio signal. After
interleaving, vector quantization is executed using two code
lengths. The adaptive bit rate ranges from 6 kbits/sec to
40 kbits/sec.

<System Configuration>

[0114] In the MPEG4 system part, multiplexing, de- 
multiplexing, and composition are defined, which will be 
described with reference to Fig. 30. 

[0115] In multiplexing, each elementary stream such as an
object as an output from a picture or audio encoder, or each
scene building information which describes the time-space
arrangement is packetized by an access unit layer. The
access unit layer adds as a header a time stamp and
reference clock for establishing synchronization for each
access unit. A packetized stream is multiplexed by a FlexMux
layer in the display or error robust unit, and sent to a
TransMux layer.
[0116] In the TransMux layer, a protection sub-layer adds an
error correction code in accordance with the necessity of
error robustness. Finally, a Mux sub-layer transmits the
resultant stream as one TransMux stream to a transmission
path. The TransMux layer is not defined in MPEG4, and can
utilize UDP/IP (User Datagram Protocol/Internet Protocol) as
an Internet protocol, or an existing network protocol such
as MPEG2 TS (Transport Stream), ATM (Asynchronous Transfer
Mode) AAL2 (ATM Adaptation Layer 2), a video phone
multiplexing scheme (ITU-T recommendation H.223) using a
telephone circuit, or digital audio broadcasting.
[0117] The access unit layer and FlexMux layer can be
bypassed to decrease the overhead of the system layer and
easily embed a conventional transport stream.
[0118] On the decoding side, a buffer (DB: Decoding Buffer)
is disposed on the output stage of demultiplexing in order
to synchronize objects, and absorbs the difference in
arrival time or decoding time between objects. Before
composition, a buffer (CB: Composition Buffer) is arranged
to adjust the display time.

<Basic Structure of Video Stream> 

[0119] Fig. 31 shows a layer structure. 
[0120] Each layer is called a class, and each class has a
header. The header contains various kinds of coding
information in addition to the start code, end code, ID,
shape, and size.

[Video Stream] 

[0121] A video stream is made up of a plurality of ses- 
sions. The session is a closed sequence. 
[0122] [VS] (Video Session) is made up of a plurality 
of objects. 

[0123] [VO] (Video Object) 

[0124] [VOL] (Video Object Layer) is an object unit sequence
including a plurality of layers.
[0125] [GOV] (Group Of Video object plane) is made 
up of a plurality of planes. 

[0126] The plane (object for each frame) has an error 
robust bitstream structure. 

[0127] In MPEG4, the coding scheme itself has trans- 
mission error robustness so as to cope with mobile com- 
munication (radio communication). In the conventional 
standard scheme, however, error correction is mainly 
done on the system side. In a PHS network or the like, 
the error rate is very high, and errors which cannot be 
completely corrected on the system side may leak to a 
video coded portion. 

[0128] Considering this, MPEG4 assumes various er- 
ror patterns which cannot be completely corrected on 
the system side, and realizes an error robust coding 
scheme which suppresses propagation of an error as 
much as possible even in this environment. 
[0129] A detailed error robust method for image cod- 
ing, and a bitstream structure therefor will be explained. 

(1) RVLC (Reversible VLC) and Two-Way Decoding 
(Figs. 32A, 32B) 

[0130] Fig. 32A is a view for explaining one-way de- 
coding by normal VLC. If mixing of an error is confirmed 
during decoding, decoding processing is suspended at 
that time. 

[0131] Fig. 32B is a view for explaining two-way decoding
processing. If mixing of an error is confirmed during
decoding, decoding processing is suspended, and the next
sync signal is detected. Upon detecting this sync signal,
the bitstream is decoded in an opposite direction from the
suspended portion. Thus, the number of decoding start points
increases without any new additional information, and the
information amount decodable upon generation of an error can
be increased, compared to a conventional scheme. A variable
length code decodable in both forward and backward
directions can realize "two-way decoding".

(2) Transmission of Important Information a Plurality of
Times (Fig. 33A)

[0132] An arrangement capable of transmitting important
information a plurality of times can be adopted to enhance
error robustness. For example, displaying each VOP at a
correct timing requires a time stamp, and this information
is contained in the first video packet. Even if this video
packet is lost due to an error, the structure enables
resuming decoding processing from the next video packet.
However, this video packet does not have any time stamp, so
the display timing cannot be attained. To prevent this,
MPEG4 adopts a structure capable of setting an HEC (Header
Extension Code) flag in each video packet and adding
important information such as a time stamp. After the HEC
flag, a time stamp and VOP coding mode type can be added.
[0133] If step-out of packets occurs, decoding starts from
the next sync recovery marker (RM). In each video packet,
necessary information (the number of the first MB contained
in the packet and a quantization step size for the MB) is
set immediately after RM. After this information, an HEC
flag is inserted. For HEC = 1, TR and VCT are added
immediately after HEC. With the pieces of HEC information,
even if the start video packet fails in decoding and is
discarded, a video packet having HEC = 1 and subsequent
video packets can be correctly decoded and displayed.
Whether HEC is set to "1" can be freely set on the encoding
side.
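
A minimal sketch of how a decoder could exploit the HEC information follows. The packet model is hypothetical: the field names are illustrative, and a real video packet carries these values as bit fields after the sync recovery marker rather than as attributes.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class VideoPacket:
    first_mb: int                      # number of the first MB in the packet
    quant: int                         # quantization step size for that MB
    hec: bool = False                  # HEC flag
    time_stamp: Optional[int] = None   # carried by the first packet or when hec is set
    lost: bool = False                 # packet discarded because of an error

def recover_time_stamp(packets):
    """Return the time stamp from the first intact packet that carries one."""
    for p in packets:
        if not p.lost and p.time_stamp is not None:
            return p.time_stamp
    return None

def displayable(packets):
    """First-MB numbers of intact packets decoded once a time stamp is known."""
    have_time, out = False, []
    for p in packets:
        if p.lost:
            continue
        have_time = have_time or p.time_stamp is not None
        if have_time:
            out.append(p.first_mb)
    return out
```

In this model, losing the first packet makes every following packet without HEC information undisplayable until a packet with the flag set arrives, which is why repeating the time stamp improves error robustness.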

(3) Data Partitioning (Fig. 33B) 

[0134] On the encoder side, a bitstream is constituted by
repeating coding processing in units of MBs. If an error is
mixed in the bitstream, subsequent MB data cannot be
decoded.

[0135] Assume that a plurality of pieces of MB infor- 
mation are classified into several groups, and pieces of 
MB information in the respective groups are arranged 
in a bitstream. In this case, marker information is assem- 
bled at the boundary of each group. Even if an error is 
mixed in the bitstream, and subsequent data fail in de- 
coding, synchronization is established at the marker at 
the end of the group, and data of the next group can be 
correctly decoded. 

[0136] A data partitioning method of grouping video packets
into motion vectors and texture information (DCT
coefficients or the like) on the basis of this concept is
employed. A motion marker (MM) is set at the boundary
between the groups. Since a DCT coefficient after MM can be
correctly decoded even if an error is mixed in motion vector
information, MB data corresponding to a motion vector before
mixture of the error can be accurately reconstructed
together with the DCT coefficient. Even when an error is
mixed in a texture portion, an image accurate to a certain
degree can be interpolated and reconstructed (concealment)
using motion vector information and preceding decoded frame
information as long as the motion vector is accurately
decoded.

(4) Variable Length Interval Sync Scheme 

[0137] A sync recovery method using a variable length packet
will be explained. MBs with a sync signal at the start are
called a "video packet", and the number of MBs contained in
the video packet can be freely set on the encoding side.
When an error is mixed in a bitstream using a VLC (Variable
Length Code), subsequent codes cannot be synchronized and
decoded. Even in this case, subsequent information can be
correctly decoded by detecting the next sync recovery
marker.
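
The resynchronization step can be sketched as a search for the next marker after the position where the error was detected. The marker pattern below is a hypothetical placeholder chosen for this sketch, and the bitstream is modeled as a string of "0" and "1" characters for readability.

```python
# Hypothetical sync recovery marker; assumed never to occur inside coded data.
RESYNC_MARKER = "0" * 16 + "1"

def next_packet_start(bits, error_pos):
    """Index of the first bit after the next marker at or beyond error_pos.

    Returns -1 when no further marker exists in the bitstream.
    """
    i = bits.find(RESYNC_MARKER, error_pos)
    return -1 if i < 0 else i + len(RESYNC_MARKER)
```

Everything between the error position and the returned index is skipped, so shorter video packets limit how much data a single error can cost.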

<Byte Alignment>

[0138] A bitstream adopts a byte alignment structure so as
to match a system in which information is multiplexed by an
integer multiple of bytes. To attain byte alignment, stuff
bits are inserted at the end of each video packet. These
stuff bits are also used as an error check code in the video
packet.
[0139] The stuff bits are formed from a code made up of "1"s
except for the first bit "0", such as "01111". If every MB
up to the last one in the video packet has been correctly
decoded, the next bit is necessarily "0", followed by a run
of "1"s whose length is one less than the stuff bit length.
Hence, when a pattern which does not obey this rule is
detected, previous decoding has not correctly been done, and
mixture of an error in the bitstream can be detected.
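
The stuff bit rule can be sketched as follows, with bits modeled as a string of "0" and "1" characters. The function names are illustrative, and the choice to insert a full byte of stuffing when the payload is already byte aligned is an assumption of this sketch.

```python
def stuffing_bits(payload_bits):
    """Return '0' followed by '1's so that payload + stuffing is byte aligned."""
    pad = 8 - (len(payload_bits) % 8)   # 1 to 8 stuff bits (assumption: never 0)
    return "0" + "1" * (pad - 1)

def stuffing_is_valid(packet_bits, payload_len):
    """Check the bits after the last decoded MB against the expected pattern."""
    tail = packet_bits[payload_len:]
    return tail == stuffing_bits(packet_bits[:payload_len])
```

A 3-bit payload, for instance, needs the 5-bit pattern "01111" from the paragraph above; if the decoder reaches the end of the MB data and finds anything else, it can conclude that an earlier codeword was decoded incorrectly.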

[0140] The MPEG4 technique has been described. This is
described in "Outline of International Standard MPEG4 Was
Determined", Nikkei Electronics Vol. 1997.9.22,
pp. 147 - 168, "Full View of MPEG4 Is Coming Into Sight",
Text of the Institute of Image Information and Television
Engineers, 1997.10.2, and "Recent Standardization Trends and
Image Compression Technique of MPEG4", Japanese Industry
Engineering Center, 1997.2.3 Seminar Material.

[First Embodiment] 

[0141] An MPEG4 system according to the first em- 
bodiment of the present invention will be described. 
[0142] Fig. 1 is a block diagram showing the schematic
arrangement of a reception/playback system according to the
first embodiment. The reception/playback system can receive
a program such as a television program to display it on a
display device. Further, the reception/playback system can
output a picture or audio played back by a
recording/playback device such as a video recorder to a
display device 13 to display or output the image or audio.

[0143] In Fig. 1, reference numeral 11 denotes a television
broadcasting reception device (TV receiver) for receiving an
MPEG4 coding type digital television program; and 12, a
recording/playback device which records and plays back a
picture, audio, or the like, and corresponds to, e.g., a
player for recording received television information on a
recording medium such as a video tape or DVD, or playing
back a picture, audio, or the like recorded on a recording
medium. The display device 13 receives picture and audio
signals, and outputs them. The television broadcasting
reception device 11 is a reception tuner device such as an
STB (Set-Top Box), and the recording/playback device 12 is a
home server, digital VTR, or the like using a DVD, HD (Hard
Disk), or the like. The representative product form of the
display device 13 is a TV (television), display, or the
like. Television broadcasting data received by the
television broadcasting reception device 11 is displayed on
the display device 13. A picture or audio recorded by the
recording/playback device 12 and played back is displayed on
the display device 13. This is the basic operation.

[0144] Fig. 2 is a block diagram showing the arrangement of
the television broadcasting reception device 11 according to
the first embodiment.
[0145] Digital television broadcasting data received by a
satellite antenna 12 or by a cable television broadcasting
terminal via a cable 13 is tuned by a tuner 14 or 15, and
adjusted for reception. One of television data received from
satellite television broadcasting and cable television
broadcasting is selected by a data selector 16, demodulated
by a demodulation circuit 17, and subjected to error
correction processing by an error correction circuit 18.

[0146] An I/F (interface) 19 is a communication means for
transmitting/receiving television broadcasting data,
necessary command data, and the like to/from an external
device. The I/F 19 is a representative digital communication
interface. For example, the I/F 19 employs an IEEE 1394
serial bus, and comprises a data transmission/reception
processing circuit necessary for data communication, a
connector for connecting a cable (bus), and the like. A
system controller 20 controls the respective units of the
television broadcasting reception device 11. Various user
operation instructions and the like are input from an
instruction input unit 21 having an input means such as a
switch. The television broadcasting reception device 11 in
Fig. 2 has been described.
[0147] Fig. 3 is a block diagram showing the detailed
arrangement of the recording/playback device 12 according to
the first embodiment.

[0148] Television broadcasting data and AV data are
input/output via an I/F (interface) 31. The I/F 31 has
compatibility which enables data communication between the
television broadcasting reception device 11 and the display
device 13.

[0149] In receiving and recording a television program,
television data transmitted from the television broadcasting
reception device 11 is input via the I/F 31, and subjected
by a recording processing circuit 32 to recording processing
of converting the television data into a data format
suitable for a recording format and recording the converted
data on a recording medium 33. The recording processing
circuit 32 performs addition of additional data such as an
error correction code, and if necessary, data processing
such as conversion of the compression scheme (format). The
television data having undergone recording processing in the
recording processing circuit 32 is recorded on the recording
medium 33 with a recording head (not shown).

[0150] In playing back image data recorded on the recording
medium 33, video data (television data) recorded on the
recording medium 33 is played back with a playback head (not
shown). The played video data undergoes data reconstruction
and error correction by processing reverse to recording
processing.
[0151] The video data having undergone playback processing
is decoded by a decoding scheme based on the MPEG4 coding
scheme. The MPEG4 coding/decoding method has already been
described. As the sequence, various multiplexed data are
demultiplexed into image data, audio data, and other system
data by a multiplexed-data demultiplexing circuit 36. Each
demultiplexed data is decoded by a decoder 37, and output
processing of the decoded data is controlled by a
display/audio output controller 38. For each decoded object,
an object replacement processor 41 executes object
replacement processing, which is a characteristic feature of
the first embodiment: an object having a predetermined
attribute, such as a real-time image object (time, weather
forecast, or the like) that was current when the original
image was recorded, is not displayed, or is not displayed
and is replaced by current information (current time or the
like). The arrangements and operations of the decoder 37,
display/audio output controller 38, and object replacement
processor 41, which play important roles in this processing,
will be described in detail later.

[0152] An output from the display/audio output controller 38
including an output from the object replacement processor 41
is transmitted to the display device 13 via the I/F 31. A
system controller 39 controls the operations of the
respective units of the apparatus including a servo
processor 34 for controlling rotation of a recording medium
and recording/playback operation, the display/audio output
controller 38, and the object replacement processor 41. When
a command is transmitted from another device to the
recording/playback device 12, the command input to the
I/F 31 is transmitted to the system controller 39. The
system controller 39 controls the operation in accordance
with the command. A command from the user is input from an
instruction input unit 40.

[0153] The structure of the bitstream of an MPEG4 
digital television program will be explained. 
[0154] Fig. 6 is a view showing an MPEG4 bitstream. 
[0155] In Fig. 6, a data space from objects 1 to 5 (51 to
55) contains a natural image object, an audio object, and a
synthesized image object such as a computer graphic (to be
referred to as a CG hereinafter), though the types of
objects change depending on program contents and progress.
For example, for a news program, the objects are a
background object (sprite), a person's image, another
natural image object, a synthesized image object such as an
emergency news prompt report, weather information image, or
time display, and an audio object. In addition, scene
description information 56 and additional data 57 are
multiplexed as system data on the bitstream. The additional
data 57 contains time information 58, object information 59,
and other information 60. The object information 59 contains
a genre code 61 representing a genre to which each of
objects corresponding to objects 1 to 5 (51 to 55) belongs,
an object code 62 representing details of the object, and a
broadcasting station code 63 necessary for an object unique
to the broadcasting station.
[0156] In the first embodiment, the attribute of each object
is determined from the genre code 61, object code 62, and
broadcasting station code 63 contained in the object
information 59, and an object having a predetermined
attribute is subjected to processing of changing the object
to another object in playback.
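
The attribute determination can be pictured with the following sketch. The numeric code values, the class name, and the lookup table are hypothetical; the actual code assignments are whatever the broadcasting station and the receiving device agree on.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ObjectInfo:
    genre_code: int      # genre code 61
    object_code: int     # object code 62
    station_code: int    # broadcasting station code 63

# Hypothetical code values, for illustration only.
GENRE_NEWS = 1
OBJ_TIME_DISPLAY = 10
OBJ_WEATHER = 11
OBJ_PERSON = 12

# (genre, object) pairs whose objects are changed to another object in playback.
REPLACE_ON_PLAYBACK = {
    (GENRE_NEWS, OBJ_TIME_DISPLAY),
    (GENRE_NEWS, OBJ_WEATHER),
}

def needs_replacement(info):
    """True if the object has an attribute that triggers replacement."""
    return (info.genre_code, info.object_code) in REPLACE_ON_PLAYBACK
```

This sketch ignores the station code in the lookup; a table keyed additionally on `station_code` would cover objects unique to one broadcasting station.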
[0157] The object information 59 will be explained with
reference to the conceptual view of Fig. 7.

[0158] Fig. 7 shows the arrangement of the object
information 59, and is a conceptual view of the arrangement
of codes corresponding to respective broadcasting stations.
The arrangement of the object information 59 shown in Fig. 6
is classified and displayed, as shown in Fig. 7.

[0159] The genre code 61 is information representing program
contents such as "news", "professional baseball", or
"extra-long show". The object code 62 is information about
display targets such as a "time display object", "weather
image object", "person's image object", and so on for
"news". The remaining genres "professional baseball" and
"extra-long show" are similarly constituted, as shown in
Fig. 7. This data arrangement exists for each broadcasting
station. A code representing this object information
arrangement is used for each broadcasting station or
commonly to stations, and various objects are listed. A
device on the broadcasting station side and a device on the
viewer side are set to understand the same code.

[0160] The operations of the decoder 37, display/audio
output controller 38, and object replacement processor 41
described in the arrangement of the recording/playback
device 12 in Fig. 3 will be described in detail, and object
replacement processing will be exemplified.
[0161] Fig. 5 is a block diagram showing the arrangement of
the object replacement processor 41. The same reference
numerals as in Fig. 3 denote the same parts, and a
description thereof will be omitted.
[0162] In Fig. 5, video data having undergone playback
processing is demultiplexed by the multiplexed-data
demultiplexing circuit 36. Data are respectively decoded by
an audio decoding circuit 71, image decoding circuit 72, and
system data decoding circuit 73 included in the decoder 37.

[0163] Audio data is decoded by the audio decoding
circuit 71, and input as stereo audio data (A(L), A(R)) to
an audio output controller 64 in the display/audio output
controller 38, where adjustment of the volume level and
sound field localization, and compatibility with sound
multiplex broadcasting using first and second sounds are
executed. After audio to be output is selected, the audio
data is transmitted together with image data from the
I/F 31 in synchronism with it.

[0164] Image data is decoded by the image decoding
circuit 72 having a plurality of identical decoding units in
order to decode respective image objects in the image
data. The decoded image data serves as image data
(v(1) to v(n)) corresponding to the number of objects. These
image data are subjected to various display processing
and control operations by a display output controller 65
in the display/audio output controller 38. Display output
control includes output control of whether to display a
predetermined object, and control of synthesizing a plurality
of objects and a character-generated image and
outputting the synthesized image as one output image.
The display-output-controlled image data is transmitted
from the I/F 31.

[0165] System data (containing scene description data
and additional data) is decoded by the system data
decoding circuit 73. Time information (clock data) contained
in the additional data in the system data is detected
using a time information detector 66 from the decoded
system data. The detected time information is input
to the system controller 39, and can be used as the
recording time determination criterion. Of the system
data decoded by the system data decoding circuit 73,
scene description data is input to a scene description
data conversion circuit 68. The remaining system data
and additional data are input as various commands to
the system controller 39, and object information is contained
in these data.

[0166] An output from the scene description data conversion
circuit 68 is used to output the basic form of a
scene in the audio output controller 64 and display output
controller 65, and is also sent to the system controller 39.

[0167] Upon playing back video data, the time detected
by the time information detector 66 is the past time.
At this time, non-display processing or object replacement
processing is executed for an object having a predetermined
attribute. The system controller 39 determines
whether time information having a real-time attribute
is contained, from the detection result of the time
information detector 66 and object information. If a
predetermined object having a real-time attribute exists,
the system controller 39 instructs the display output
controller 65 not to display the predetermined object
(non-display processing).
[0168] The non-display object can also be replaced
by a newly generated object (character), and the
newly generated object is displayed. In this case, replacement
processing is instructed to a replacement processor 69
in the object replacement processor 41 in addition to
non-display processing, and replacement processing is
executed using another object instead of the target object.
More specifically, a "time display object" will be
exemplified as a replaceable object having a real-time
attribute. Using original data of a character image held in
a memory (ROM) 74, a character generator 70 generates
a time display character image which will replace
the time display object. Current time information at this
time is obtained from a calendar (timepiece) function
unit 67 via the system controller 39, and a time display
character image representing the obtained current time
is generated. The replacement processor 69 adds information
representing the display position so that the generated
time display character image representing the current
time appears where the original time display object was
displayed on the original image. Then, the resultant data
is input to the display output controller 65 where the data
is synthesized into image data and displayed.
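
The replacement step described above can be sketched as follows. This is only an illustrative model under stated assumptions: `ImageObject`, `replace_time_display`, the object names, and the text rendering of the time are hypothetical, and a real device would generate a character image from ROM data rather than a string.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class ImageObject:
    """Hypothetical decoded image object with a display position."""
    name: str
    x: int
    y: int
    content: str
    visible: bool = True


def replace_time_display(objects: List[ImageObject], now: datetime) -> List[ImageObject]:
    """Hide each recorded time display object and add a generated one in its place."""
    result = []
    for obj in objects:
        if obj.name == "time display object":
            # Non-display processing: the recorded object is kept but not shown.
            result.append(replace(obj, visible=False))
            # Replacement processing: a character image for the current time
            # (modeled here as text) inherits the original display position.
            text = f"{now.hour}:{now.minute:02d}"
            result.append(ImageObject("generated time display", obj.x, obj.y, text))
        else:
            result.append(obj)
    return result
```

Copying the position from the hidden object is what makes the generated time appear where the broadcast time was shown.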
[0169] Non-display processing of an object played 
back by the recording/playback device 12, and object 
replacement processing have been described. 
[0170] Note that non-display processing of a played-back
object is not limited to the use of the playback data time
information detection means. All playback data can be
recognized as past data. Thus, all predetermined objects
having real-time attributes can be controlled so as not
to be displayed in playback. A case wherein time information
is lost owing to any error or data is lost from the
beginning can be similarly dealt with.
[0171] Object replacement processing has been ex- 
plained by "time display", but the present invention can 
also be applied to another image object. 
[0172] Needless to say, the original object of a playback
image can be directly displayed. According to the
first embodiment, the image object (display position: X,
Y) to be displayed can be freely selected by operating
a switch 201 to choose between the image object (display
position: X, Y) of a played-back original and a replacement
image object formed from a character generated
by the above-mentioned procedures. By adjusting
position data, the display position of an object can
be moved.

[0173] The display device 13 for displaying AV data
output from the recording/playback device 12 will be
explained.

[0174] Fig. 4 is a block diagram showing the detailed
arrangement of the display device 13 according to the
first embodiment.

[0175] The display device 13 receives AV data from
an I/F (interface) 22 via a bus. Of the input AV data, audio
data is output from an audio controller 23 at a timing
synchronized with display of image data, and converted
into an analog signal by a D/A converter 25. Then, the
analog signal is output and played back from stereo
speakers 27. Image data is input to a display controller
24 where the display timing and display form are adjusted.
After the image data is converted into an analog signal
by a D/A converter 26, the analog signal is displayed
on a CRT 28. A system controller 29 controls these
units. An instruction input such as a display adjustment
instruction from the user is input from an instruction input
unit 30, and sent to the system controller 29.

[0176] The display device 13 of the first embodiment
has been described. Since the arrangement of the display
device 13 does not influence the characteristic features
of the present invention, the display device 13 is
not limited to the form shown in Fig. 4, and may be an
LCD (Liquid Crystal Display) or the like.

[0177] An example of the display form according to 
the first embodiment of the present invention will be ex- 
plained. 

[0178] Figs. 9A and 9B are views, respectively, showing
an on-air (original) image 101 of a recorded image,
and an example when a playback image 102 obtained
by playing back the recorded image undergoes non-display
processing.

[0179] As shown in Fig. 9A, the recorded on-air image
101 includes a "time display object (10:23)" representing
the on-air time. In the playback image 102 of Fig.
9B, this "time display object" is not displayed.
[0180] Figs. 10A and 10B are views showing an example
different from "time display" in Figs. 9A and 9B,
and are views showing an example when a "weather
forecast" image object is applied as information having
another real-time attribute. Similar to Figs. 9A and 9B,
a "weather forecast" image object 107 included in an
on-air (original) image 105 (Fig. 10A) serving as a recorded
image is subjected to non-display processing,
and is not displayed on a playback image 106 (Fig. 10B)
obtained by playing back the image.
[0181] Figs. 11A and 11B are views, respectively,
showing an on-air (original) image 103 of a recorded image,
and an example when a playback image 104 obtained
by playing back the image undergoes object replacement
processing.

[0182] The on-air image 103 recorded in the past includes
a "time display object (10:23)" 108 representing
the on-air time. In the current playback image 104, a
"time display object (7:45)" 109 generated by a character
representing the current image playback time is displayed
in place of the "time display object".
[0183] An operation sequence for image object non-display
processing and image object replacement
processing in the first embodiment of the present invention
will be explained with reference to the flow chart of
Fig. 12.

[0184] In step S01, the recording/playback device 12
shown in Figs. 3 and 5 that can record/play back MPEG4
video data plays back video data from the recording medium
33 in response to a playback operation. The played-back
video data is decoded in step S02, and pieces of object
information of image objects constituting the video data
are analyzed to check their attributes in step S03.
[0185] It is checked whether an image object having
a code representing a real-time attribute exists as a result
of analysis based on various codes represented by
the pieces of object information (step S04). If YES in
step S04, the flow shifts to step S05 to determine whether
the image object having a real-time attribute is to be
hidden. If NO in step S04, or if the image object having
a real-time attribute is determined in step S05 to be
displayed (NO in step S05), the flow shifts to step S06
to display and output the playback image as it is.
[0186] If YES in step S05, the flow advances to step
S07 to execute object non-display processing by the
above-described method. After non-display processing
is executed, it is checked in step S08 whether to create
a new image object (character) by the object replacement
processor 41 on the basis of current information
corresponding to the real-time image object, and replace
the image object having the real-time attribute by
the new image object. If NO in step S08, the flow advances
to step S09 to display and output the playback
image in a display form in which only the image object
having the real-time attribute is not displayed.
[0187] If YES in step S08, the flow shifts to step S10
to execute object replacement processing by the above-described
method. In object replacement processing, a
character image (e.g., current time) is newly generated
based on the current time, and synthesized as an image
object with the other playback image data (objects). The
resultant image is displayed and output (step S11).
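
The branching of Fig. 12 described in steps S04 to S11 can be summarized in a short sketch. The enum `DisplayForm`, the function `decide_display_form`, and its boolean parameters are hypothetical names introduced here; the step labels in the comments follow the text.

```python
from enum import Enum


class DisplayForm(Enum):
    """Hypothetical labels for the three outcomes of the Fig. 12 flow."""
    AS_IS = "S06"     # playback image displayed as it is
    HIDDEN = "S09"    # only the real-time object is not displayed
    REPLACED = "S11"  # real-time object replaced by a newly generated object


def decide_display_form(has_real_time_object: bool, hide: bool, replace: bool) -> DisplayForm:
    """Follow steps S04, S05 and S08 to pick the display form."""
    if not has_real_time_object:  # NO in step S04
        return DisplayForm.AS_IS
    if not hide:                  # NO in step S05
        return DisplayForm.AS_IS
    # Step S07: non-display processing is executed before the S08 check.
    if not replace:               # NO in step S08
        return DisplayForm.HIDDEN
    return DisplayForm.REPLACED   # steps S10 and S11
```

Note that replacement is only reachable after non-display processing, which mirrors the order of steps S07 and S08.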
[0188] In the first embodiment, the display form in video
playback is controlled by analyzing object information.
The present invention can be easily applied to an
image object other than the above-mentioned image object
having a real-time attribute.

[0189] According to the first embodiment, the apparatus
and system having the above arrangements can realize
more user-friendly video playback/display with a
higher visual effect. More specifically, in playing back
video data of a recorded television program, the apparatus
and system can suppress display of the on-air
video display time, which differs from the current time, or can
replace the on-air video display time by time information
of the video playback time and display the resultant data.
This prevents the viewer from being confused.
[0190] According to the first embodiment, playback
output of a predetermined object can be controlled. As
another effect, the number of dubbing operations can
be limited for only a predetermined object, which is also
effective in terms of copyrights.

[Second Embodiment]

[0191] The second embodiment of the present invention
will be described. In the second embodiment, a display
device comprises the non-display processing function
for a predetermined object and the object replacement
processing function that have been described in
the first embodiment.

[0192] Fig. 13 is a block diagram showing the arrangement
of a recording/playback device 71 for recording
and playing back MPEG4 video data according to
the second embodiment of the present invention. The
same reference numerals as in the arrangement of Fig.
3 denote the same parts, and a description thereof will
be omitted. The recording/playback device 71 does not
comprise the object replacement processor 41 in the
recording/playback device 12 described with reference to
Fig. 3.

[0193] The recording/playback device 71 outputs,
from an I/F (interface) 31 to an external device via a bus,
AV data obtained by decoding MPEG4 video data in
playback, and sub-data containing object information
detected in decoding and (on-air) time information.
[0194] Fig. 14 is a block diagram showing the arrangement
of a display device 72 supporting display of
an MPEG4 object image according to the second embodiment.
The same reference numerals as in the arrangement
of Fig. 4 denote the same parts, and a description
thereof will be omitted.

[0195] The display device 72 receives, from an I/F (interface)
22, AV data and sub-data that are output and
transmitted from the recording/playback device 71 in
Fig. 13. From the sub-data, time information accessory
to the AV data is detected by a time information detector
51, whereas object information is detected by an object
information detector 52.

[0196] A system controller 29 determines the data recording
time from the time information of the input AV
data detected by the time information detector 51, compares
the determined time with the current time from a
calendar function unit 56, and if the two times are different,
executes object non-display processing/replacement
processing for an object having a predetermined
attribute. At this time, the system controller 29 determines
an object from the object information detected by
the object information detector 52. If an object having a
predetermined real-time attribute exists, the system
controller 29 instructs a display output controller 24
capable of controlling display for each object to perform
non-display processing so as not to display the predetermined
object. Alternatively, the non-display object
can be replaced by a newly generated object (character).
In this case, in addition to non-display processing,
the system controller 29 instructs an object replacement
processor 53 to execute replacement display processing
using another object instead of the target object.
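
The time comparison performed by the system controller 29 might be sketched as below. The function names, the `tolerance` parameter, and the representation of objects as name strings are assumptions for illustration; the text only states that processing is executed when the recording time and the current time differ.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Set


def is_past_recording(recorded_at: datetime, now: datetime,
                      tolerance: timedelta = timedelta(0)) -> bool:
    """True when the recording time differs from the current time.

    `tolerance` is a hypothetical allowance for clock jitter; the default of
    zero treats any difference as "different".
    """
    return abs(now - recorded_at) > tolerance


def objects_to_process(object_names: Iterable[str], real_time_names: Set[str],
                       recorded_at: datetime, now: datetime) -> List[str]:
    """Return the objects that should undergo non-display or replacement processing."""
    if not is_past_recording(recorded_at, now):
        return []  # the input is effectively live, so nothing is hidden
    return [name for name in object_names if name in real_time_names]
```

For example, with a recording time of 10:23 and a current time of 19:45, only the objects named in `real_time_names` would be selected.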
[0197] More specifically, a "time display object" will be 
exemplified as a replaceable object. Using a character 
image held in a memory (ROM) 55, a character gener- 
ator 54 generates a time display character image which 
will replace the target object. Current time information 
at this time is obtained from the calendar (timepiece) 
function unit 56 via the system controller 29, and a time 
display character image representing the obtained cur- 
rent time is generated. The object replacement proces- 
sor 53 designates the display position or the like, and 
inputs the generated time display character image to the 
display output controller 24 so as to replace the original 
time display object by the generated time display char- 
acter image. Then, the display output controller 24 syn- 
thesizes and displays the input image data. 
[0198] Non-display processing of a predetermined 
object in image data of input AV data, and object re- 
placement processing have been described. 
[0199] An input image having undergone non-display 
processing or both non-display processing and object 
replacement processing in the display device according 
to the second embodiment is displayed in a form similar 
to Figs. 9A and 9B, 10A and 10B, or 11A and 11B, as 
described in the first embodiment. 
[0200] Object replacement processing has been ex- 
plained by "time display", but the present invention can 
also be applied to another image object. An object in- 
cluded in a recorded image can be directly displayed. 
[0201] Also in the second embodiment, the image object
(display position: X, Y) to be displayed can be freely
selected by operating a switch 201 to choose between
the image object (display position: X, Y) of an input
original and a replacement image object formed
from a character generated by the above-mentioned
procedures. By adjusting position data of a display image,
the display position of an object can be moved.
[0202] An operation sequence for image object non- 
display processing and image object replacement 
processing in the display device 72 according to the sec- 
ond embodiment of the present invention will be ex- 
plained with reference to the flow chart of Fig. 15. 
[0203] Upon reception of AV data and accessory sub- 
data (step S21), the display device 72 shown in Fig. 14 
that can display MPEG4 AV data detects and analyzes 
time information in the sub-data (step S22), and ac- 
quires the time information representing the recording 
time of the AV data. Further, the display device 72 ana- 
lyzes object information for image objects constituting 
the input image data, and checks their attributes (step 
S23). In step S24, it is checked whether an image object 
having a code representing a real-time attribute exists 
as a result of analysis based on various codes repre- 
sented by the pieces of object information. If YES in step 
S24, the flow shifts to step S25 to determine whether 
the image object having a real-time attribute is not dis- 
played. 



[0204] If NO in step S24, or if the image object having
a real-time attribute is determined in step S25 to be displayed
(NO in step S25), the flow shifts to step S26 to
display and output the input image as it is.

[0205] If YES in step S25, the flow advances to step
S27 to execute object non-display processing by the
above-described method. After non-display processing
is executed in step S27, it is also possible to create a
new image object (character) based on current information
corresponding to the real-time image object by the
object replacement processor 53 and character generator
54, and replace the real-time image object by the
new image object.

[0206] In this case, after object non-display processing
is executed in step S27, the flow shifts to step S28
to check whether to execute object replacement
processing. If NO in step S28, the flow advances to step
S29 to display and output the input image in a display
form in which only the real-time image object is not
displayed.

[0207] If YES in step S28, the flow shifts to step S30
to execute object replacement processing by the above-described
method. In object replacement processing in
step S30, a character image is newly generated based
on current information, and synthesized as an image object
with the other input image data (objects). The resultant
image is displayed and output (step S31).
[0208] In the second embodiment of the present invention,
the display form in video display is controlled
by analyzing object information. The present invention
can be easily applied to an image object other than the
above-mentioned image object having a real-time attribute.

[0209] The display device according to the second
embodiment realizes more user-friendly video display
with a higher visual effect. More specifically, the display
device which receives video data can control an input
image having a time display object different from the current
time so as not to display the time display object, or
can replace the time display object by time information
of the playback time and display the resultant image.
This prevents the viewer from being confused.

[Third Embodiment] 

[0210] In the third embodiment, the attribute of each
object is determined from a genre code 61, object code
62, and broadcasting station code 63 contained in object
information 59, and an object having a predetermined
attribute is subjected to processing of changing the object
to another object in playback. As the predetermined
attribute, an emergency news prompt report image (telop)
object will be exemplified.

[0211] Fig. 34 is a view for explaining each genre code
61 and a corresponding object code 62 for each broadcasting
station. An image of the arrangement of each
object information is illustrated, and a code arrangement
corresponding to each broadcasting station is exemplified.

[0212] The genre code 61 is a code representing a
program genre such as "news", "professional baseball",
or "movie". For "news", the object code 62 is a "background
image", "person's image", "weather forecast image",
or the like. For "professional baseball", the object
code 62 is a "background image", "count display image",
"player image", or the like. For the genre "movie", the
object code 62 is a "background", another image, or
"emergency news image" 64 such as an earthquake
prompt report. A combination of the genre code 61 and
object codes 62 exists for each broadcasting station
code 63 representing each broadcasting station.
[0213] Each broadcasting station provides a user with
codes for identifying objects as pieces of object information
by using a code common to respective stations
or unique to a station. A device on the broadcasting station
side and a device on the user side are set to understand
the same code.

[0214] Fig. 35 is a block diagram showing the arrangement
of a portion relating to object playback/object
form change processing in the arrangement of a recording/playback
device 12 according to the third embodiment
of the present invention. The same reference numerals
as in Fig. 3 denote the same parts in Fig. 35, and
a description thereof will be omitted.
[0215] In Fig. 35, video data which was played back
from a recording medium 33 and processed by a playback
processing circuit 35 is demultiplexed by a multiplexed-data
demultiplexing circuit 36. Data are respectively
decoded by an audio decoding circuit 71, image
decoding circuit 72, and system data decoding circuit
73 included in a decoder 37.

[0216] Audio data is decoded by the audio decoding
circuit 71, and sent as stereo audio data (A(L), A(R)) to
an audio output controller 64 in a display/audio output
controller 38, where adjustment of the volume level and
sound field localization, compatibility with sound multiplex
broadcasting using first and second sounds, and
change and addition of an audio object are executed.
[0217] Image data is decoded by the image decoding
circuit 72 having a plurality of identical decoding units in
order to decode respective image objects in the image
data. Image data corresponding to the objects decoded
by the image decoding circuit 72 serve as image data
(v(1) to v(n)) corresponding to the number of objects.
These image data are sent to a display output controller
65 of the display/audio output controller 38, and subjected
to various display processing and control operations.
[0218] An object having a predetermined attribute undergoes
processing of changing the playback (reproducing)
form by the respective units of the display/audio
output controller 38. As a playback form change example,
when an image object such as an emergency news
telop as a predetermined object attribute is played back,
additional processing is executed for the object by any
one of the following (A) to (D).



(A) The image object is replaced by an icon object 
using an internally generated character image. 

(B) The image object of the original emergency 
news is played back. 

(In addition to (B), recording time information is added
and displayed.)

(C) Change of the playback form is indicated by a 
warning sound using an audio object. 

(D) No display is performed. 

[0219] Alternatively, a playback form can be freely se- 
lected. 

[0220] The played-back image and audio objects, including
the object whose playback form was changed
as necessary, are mixed and transmitted
as AV data via an I/F (interface) 31.
[0221] System data (containing scene description data
and additional data) is decoded by the system data
decoding circuit 73 of the decoder 37. Time information
necessary for determining the time is detected by a time
information detector 66 from the decoded system data.
More specifically, the time is detected from time information
(clock data) contained in additional data of the
system data in decoding. The detected time information
is input to a system controller 39, and can be used as
the recording time determination criterion.
[0222] Of the system data decoded by the system data
decoding circuit 73, scene description data is input to
a scene description data conversion circuit 68. The remaining
system data and additional data are input as
various commands to the system controller 39, and object
information is contained in these data.
[0223] An output from the scene description data conversion
circuit 68 is supplied to the audio output controller
64 and display output controller 65 where the output
is used to output the basic form of a scene, and is also
sent to the system controller 39.
[0224] An object generator/controller 400 is constituted
by an object controller 710 for issuing a playback
form change instruction for an object having a predetermined
attribute, a sound source 740 serving as a means
for generating an audio object, a character generator 70
for generating an image object such as an icon, and a
memory (ROM) 74 for holding original data. The object
controller 710 identifies an object having a predetermined
attribute upon reception of an instruction from the
system controller 39, and changes and controls the display
form in accordance with the set contents. In addition,
the object controller 710 controls insertion of generated
image and audio objects into playback data, and adjusts
the playback form change timing.
[0225] As the sequence, the object controller 710
identifies an object code corresponding to "emergency
news" or the like from attribute information in the
object information transmitted from the system
controller 39. The object controller 710 sends an image
playback form change instruction for the object to the
respective units of the display/audio output controller 38
in accordance with the identification. At this time, an icon
image object used in changing the playback form is generated
by the character generator 70 using original data
stored in the memory (ROM) 74. This icon image object
is sent to the display output controller 65. An audio object
used for a warning sound is obtained by sending an
audio object generated by the sound source 740 to the
audio output controller 64.

[0226] Processing of not displaying a played-back object
is realized by control of not displaying only this object
under the control of the object controller 710.
[0227] As one of functions according to the third embodiment
of the present invention, when, e.g., an "emergency
news" image as the display form of an original is
to be played back, time display of the occurrence time
can be synthesized with the playback image and displayed.
In this case, a time display object is generated
by the character generator 70 on the basis of recording
time information obtained from an output from the decoder
37 by using original data stored in the memory 74,
and inserted into playback data and synthesized as one
image object.

[0228] Fig. 36 is a view for explaining the function of 
a selection means for changing the playback form of im- 
age data having a predetermined attribute in the object 
controller 710. 

[0229] In Fig. 36, reference numeral 2010 denotes a 
selector (switch) which can select whether to directly 
display, e.g., "emergency news", to switch it to another 
icon and display the icon, or not to display "emergency 
news". A switch 2020 determines whether to add, e.g., 
a warning sound to "emergency news". A synthesizer 
2030 synthesizes a picture selected by the selector 
2010 and a warning sound input via the switch 2020. 
The functions of these units are executed by the display 
output controller 65 and audio output controller 64 in Fig. 
35. A switching instruction to the selector 2010 and in- 
sertion of a generated object are controlled by the object 
controller 710. 
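
The roles of the selector 2010, the switch 2020, and the synthesizer 2030 in Fig. 36 can be modeled with a small sketch. The names `Selector2010` and `synthesize`, and the use of strings to stand in for picture and sound data, are assumptions for illustration only.

```python
from enum import Enum
from typing import Optional, Tuple


class Selector2010(Enum):
    """Hypothetical positions of the selector 2010."""
    ORIGINAL = "original"  # display "emergency news" directly
    ICON = "icon"          # switch to an icon and display the icon
    NONE = "none"          # do not display "emergency news"


def synthesize(selector: Selector2010, warning_switch_on: bool,
               original: str = "emergency news telop",
               icon: str = "icon") -> Tuple[Optional[str], Optional[str]]:
    """Combine the selected picture with the optional warning sound.

    Returns (picture, sound); either element is None when nothing is output.
    """
    if selector is Selector2010.ORIGINAL:
        picture = original
    elif selector is Selector2010.ICON:
        picture = icon
    else:
        picture = None
    # The switch 2020 decides whether a warning sound is added.
    sound = "warning sound" if warning_switch_on else None
    return picture, sound
```

Keeping the picture selection and the sound switch independent reflects the text, in which the warning sound can be added to whichever display form is chosen.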

[0230] In playback, when the object controller 710 
identifies an object code corresponding to "emergency 
news" in playback data, it sends a playback form change 
instruction for the object to the respective units of the 
display/audio output controller 38. The playback pattern 
of the object at this time can be selected from three pat- 
terns: (A) an image object "icon" generated in the appa- 
ratus is displayed at a predetermined display position 
(X0,Y0) at the upper left corner of the screen, (B) an
image object (emergency information telop) of an origi- 
nal is played back and displayed at the original display 
position (X,Y), and (C) only the object is not displayed. 
[0231] When the image object of an original is to be 
displayed as it is, an image object representing the re- 
cording time can be generated by the character gener- 
ator 70 based on current time information, superposed 
on the image, and additionally displayed. 
[0232] As initial settings, the object controller 710 is
desirably set to, when an object having a predetermined
attribute is detected, display the object as an icon. The
object controller 710 is more desirably constituted to
arbitrarily select settings from the above-mentioned three
patterns by operating the selector 2010 at a given timing
in accordance with user tastes.

[0233] Moreover, an audio object as an effective
warning sound can be added to warn the user that an
image object was changed. In this case, an audio object
generated by the sound source 740 can be synthesized
to output AV data via the synthesizer 2030 by turning on
the switch 2020.

[0234] An operation according to the third embodiment
of the present invention, and examples of the display
form will be explained with reference to Figs. 37A
to 37D and 38A to 38C.

[0235] Fig. 37A shows one frame of the on-air image
of an animation which is an original television image recorded
in the recording/playback device 12. An image
object 1100 of a telop representing eruption of a volcano
is additionally displayed as an example of "emergency
news" on an on-air image 1101 of the animation.
[0236] Fig. 37B shows an example of a playback image
according to the third embodiment. As (A), the
emergency news is displayed as an icon 1103 in playing
back the recorded image of the animation shown in Fig.
37A. On a playback image 1102, the icon (image object)
1103 is displayed instead of the image object 1100 of
the "emergency news" telop whose playback form is to
be changed.

[0237] Fig. 37C is a view showing an operation of designating
the icon 1103 displayed on the playback image
1102 with a mouse cursor 1104 during playback of the
animation in the display form of Fig. 37B, and issuing
an instruction of displaying the detailed contents of the
icon (contents of the emergency news). The mouse cursor
1104 is operated with an instruction input means
such as a mouse, and the icon 1103 is clicked to execute
the instruction.

[0238] Fig. 37D shows an example in which the playback form is changed
back to display of the original image in response to the instruction
issued in Fig. 37C, as playback pattern (B). In this case, the original
image object 1100 of the "emergency news" is played back and displayed
on the playback image 1102. As additional display information, a time
display object 1110 representing the recording time of the original
image can also be displayed.

[0239] In this manner, the playback pattern (A) in which an image
object having a predetermined attribute is displayed as an icon, and
the playback pattern (B) in which the image object is displayed without
changing the original, can be switched by instruction with a simple
operation.

[0240] Fig. 38A is a view showing an example of a television image in
which an image object 1106 of a telop representing an earthquake as
"emergency news" is displayed in an on-air image 1105 during
broadcasting of a movie program guide, and directly recorded in the
recording/playback device 12.

[0241] Fig. 38B is a view showing an example in which an icon 1108 is
played back and displayed instead of the image object 1106 of the
"emergency news" telop, in correspondence with the above playback
pattern (A), in a playback image 1107 in playing back the recorded
image in Fig. 38A. In this case, an audio object is inserted as the
playback pattern (C) in a scene in which the display form of the
"emergency news" is changed to the icon 1108, and a warning sound 1109
is output to warn the user.

[0242] Fig. 38C is a view showing an example when 
the recorded image is played back without displaying 
either the icon or the image object of the "emergency 
news" telop, as the playback pattern (D). 
[0243] In this fashion, the playback pattern (C) in which a warning
sound is output by adding an audio object, and the playback pattern (D)
in which an image object whose playback form is to be changed is not
displayed at all, can be freely set, similarly to the playback patterns
(A) and (B).

[0244] Fig. 39 is a flow chart for explaining image playback processing
in the recording/playback device 12 according to the third embodiment
of the present invention.
[0245] If playback operation for a recorded image is 
instructed in the playback mode of the recording/play- 
back device 12 (step S101), video data played back 
from the recording medium 33 is decoded by the decod- 
er 37 in step S102. In step S103, pieces of object infor- 
mation are analyzed for image objects constituting the 
video data, and their attributes are checked based on 
various object codes. In step S104, it is determined 
based on the results of analysis whether an image ob- 
ject having an attribute formed from an "emergency 
news" code exists among the image objects. This 
means a case wherein the "emergency news" attribute 
is set in advance for a target object whose playback form 
is to be changed. 

[0246] If NO in step S104, the flow advances to step S110 to directly
output playback data as AV data.
[0247] If YES in step S104, the flow advances to step S105 to read out
a set value used in changing the playback form (a value specifying the
playback pattern). If the set value is "1", the flow shifts to step
S106, the output of the "emergency news" image object is changed to a
predetermined icon, and the icon is displayed (playback pattern (A)),
as shown in Fig. 37B. If the set value is "0" in step S105, the flow
shifts to step S107, the "emergency news" image object is played back
as it is without changing the original, as shown in Fig. 37D, and a
"time display" image object representing the on-air time is added and
displayed in step S108 (playback pattern (B)). If the set value is "2"
in step S105, the flow shifts to step S109 to inhibit display of the
"emergency news" image object. In this case, no icon is displayed
either (playback pattern (D)).

[0248] In accordance with whichever of the three set values is set, the
playback form of the "emergency news" image object is changed, and the
object is displayed and output as AV data together with the other
played-back data (step S110).
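The branching of steps S104 to S110 can be illustrated with a minimal Python sketch. The set values "0", "1", and "2" and their playback patterns follow the description above. The ImageObject type, the attribute label "emergency_news", the function name apply_playback_form, and the representation of a frame as a list of object names are assumptions introduced only for illustration; they are not part of the disclosed device.

```python
from dataclasses import dataclass


@dataclass
class ImageObject:
    name: str       # label of the decoded image object (hypothetical)
    attribute: str  # attribute derived from its object code (hypothetical)


# Set values read out in step S105, as given in the description.
SHOW_ORIGINAL = 0  # playback pattern (B): original plus time display
SHOW_ICON = 1      # playback pattern (A): replace by an icon
HIDE = 2           # playback pattern (D): display nothing


def apply_playback_form(objects, set_value, target="emergency_news"):
    """Return the names of the objects to output for one frame."""
    output = []
    for obj in objects:
        if obj.attribute != target:
            # Objects without the target attribute are output unchanged.
            output.append(obj.name)
        elif set_value == SHOW_ICON:
            output.append("icon")           # step S106
        elif set_value == SHOW_ORIGINAL:
            output.append(obj.name)         # step S107
            output.append("time_display")   # step S108
        elif set_value == HIDE:
            continue                        # step S109: no icon either
        else:
            raise ValueError(f"unsupported set value: {set_value}")
    return output
```

For a frame made up of an animation object and an emergency news telop, the three set values yield the animation with an icon, the animation with the telop and a time display, or the animation alone.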
[0249] In step S111, it is checked whether change of the set value is
instructed with the mouse cursor 1104, as shown in Fig. 37C. If YES in
step S111, the flow shifts to step S112 to change the set value to a
newly set value in accordance with the setting instruction. Note that
the set value may be input from, e.g., the instruction input unit 40 of
the recording/playback device 12. In this way, the display/playback
form of an image object having a predetermined attribute can be easily
changed. After the set value is changed, whether an "emergency news"
image object exists is determined again in step S104. If YES in step
S104, the flow advances to step S105 to change the playback form in
accordance with the newly set value.
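The handling of the set value in steps S111 and S112 can be sketched as a small state holder. This is only an illustration under stated assumptions: the class name PlaybackFormSetting and its methods are hypothetical, the icon pattern is taken as the initial value in line with the initial settings described for the object controller, and the icon click of Fig. 37C is modelled as a switch from set value "1" to set value "0".

```python
class PlaybackFormSetting:
    """Holds the set value that step S105 reads out (hypothetical helper)."""

    # 0: original (pattern (B)), 1: icon (pattern (A)), 2: hidden (pattern (D))
    VALID_VALUES = (0, 1, 2)

    def __init__(self, value: int = 1) -> None:
        # Validate the initial value through the same path as later changes.
        self.change(value)

    def change(self, new_value: int) -> None:
        """Step S112: replace the set value with the newly instructed one."""
        if new_value not in self.VALID_VALUES:
            raise ValueError(f"unsupported set value: {new_value}")
        self.value = new_value

    def on_icon_clicked(self) -> None:
        """Fig. 37C: clicking the icon requests the original telop."""
        if self.value == 1:
            self.change(0)
```

A click has an effect only while the icon is shown; in the other patterns the set value would be changed through another input path, such as an instruction input unit.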

[0250] If NO in step S111, the operation from step S101 is repetitively
executed until the playback mode ends in accordance with a user
instruction or system factor. If playback ends in step S113, the
operation stops, and playback processing ends.
[0251] In the third embodiment, an object having a predetermined
attribute has been described as a playback form change target by
exemplifying an "emergency news (telop)" image object. The present
invention is not limited to this, and can be applied to all objects
such as various telops including a "prompt report of election returns"
and a "weather forecast image", or an image or audio such as a
"subtitle of a movie" for which the user wants to change the playback
form.

[0252] The output destination of image and audio data from the
recording/playback device 12 according to the third embodiment is not
limited to the display device 13, and can be another recording/playback
device. In other words, the present invention can be applied to
dubbing.
[0253] As described above, according to the third embodiment, the
playback form of only an image object having a predetermined attribute
can be changed in playing back a recorded image. The user can delete an
unwanted image, or can display other image data. The third embodiment
can provide a more user-friendly video playback function with a higher
visual effect.

[Fourth Embodiment] 


[0254] The fourth embodiment of the present inven- 
tion will be described. The fourth embodiment will ex- 
plain a display device 75 having a playback form change 
function. 

[0255] Fig. 40 is a block diagram showing the arrangement of the
display device 75 for recording and playing back MPEG4 video data
according to the fourth embodiment of the present invention. The same
reference numerals as in the arrangement of Fig. 3 denote the same
parts in Fig. 40, and a description thereof will be omitted.

[0256] The recording/playback device 75 outputs, from an I/F
(interface) 31 to an external device via a bus, AV data obtained by
decoding MPEG4 video data in playback, and sub-data containing object
information detected in decoding.

[0257] Fig. 41 is a block diagram showing the ar- 
rangement of a display device 76 which corresponds to 
the display device 75 according to the fourth embodi- 
ment, and copes with playback of an MPEG4 object im- 
age. The display device 76 can receive and display tel- 
evision data from the display device 75 in Fig. 40 or the 
television broadcasting reception device 11 in Fig. 2. 
[0258] As a function added to the display device 13 shown in Fig. 4,
the display device 76 comprises a function of defining a display form
and audio output form as a "playback form", and changing the playback
form for a predetermined object. In Fig. 41, the same reference
numerals as in Fig. 4 denote the same parts, and a description thereof
will be omitted.
[0259] The display device 76 receives, via an I/F (interface) 22, AV
data and sub-data output from the display device 75 in Fig. 40 or the
television broadcasting reception device 11. From the sub-data, time
information
accessory to the AV data is detected by a time informa- 
tion detector 177, whereas object information is detect- 
ed by an object information detector 178. Input audio 
data is processed by an audio controller 23, transmitted 
to a D/A converter 25, and played back. Image data is 
processed, displayed, and controlled by a display con- 
troller 24. 

[0260] Of the objects constituting the input image or audio data, an
object having a predetermined attribute undergoes processing of
changing the playback form by the respective units of the display
controller 24 and audio controller 23. As a playback form change
example, if an image object having a predetermined object attribute is
received, additional processing is executed for the object by any one
of the following (a) to (d).

(a) The image object is replaced by an "icon" object 
using an internally generated character image. 

(b) The object of the original is played back. 

(In addition to (b), a "time display" image object is 
added and displayed on the basis of time informa- 
tion accessory to the data.) 

(c) Change of the playback form is indicated by a 
warning sound using an audio object. 

(d) No display is performed. 

[0261] Alternatively, a playback form can be freely selected.

[0262] Time information necessary to determine the time is detected
using the time information detector 177 from time information contained
in sub-data. The detected time information is input to a system
controller 131, and used to generate a "time display" image object.
[0263] An object controller 179 issues a playback (i.e., display and/or
audio output) form change instruction for an object set in advance as a
playback form change target. The object controller 179 comprises and
controls a sound source 182 serving as a means for generating an audio
object, a character generator 180 for generating an image object such
as an icon, and a memory (ROM) 181 for holding original data.
[0264] The object controller 179 identifies the attribute of an object
from object information, controls its display, controls insertion of
generated image and audio objects into playback data, and adjusts the
change timing of the playback form. As the sequence, when the object
controller 179 identifies an object code having a predetermined
attribute whose playback form is to be changed, on the basis of object
information which is detected by the object information detector 178
and transmitted from the system controller 131, the object controller
179 sends a playback form change instruction for the object to the
respective units of the audio controller 23 and display controller 24.

[0265] As an icon image object used in changing the playback form, an
icon generated by the character generator 180 using original data in
the memory (ROM) 181 is sent to the display controller 24. As an audio
object used for a warning sound, an audio object generated by the sound
source 182 is sent to the audio controller 23.
[0266] Non-display processing of a played-back object is done by
suppressing display of that object only.
[0267] Control of the playback form is the same as 
that described in the third embodiment with reference to 
Figs. 38A to 38C. 

[0268] When time display is to be synthesized with a playback image and
displayed, a time display object is generated by the character
generator 180 using original data in the memory (ROM) 181 on the basis
of time information obtained from an output from the time information
detector 177, and inserted and synthesized as one image object with
playback data.
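As a rough illustration of generating a "time display" object from detected time information, the following Python sketch formats a timestamp into the text that a character generator could render. The function name make_time_display_object, the dictionary layout, the date format, and the default position are all assumptions made for this example and are not specified in the description.

```python
from datetime import datetime


def make_time_display_object(recorded_at: datetime, position=(16, 16)) -> dict:
    """Build a description of a "time display" image object (hypothetical)."""
    # The text a character generator would render; the format is assumed.
    text = recorded_at.strftime("%Y-%m-%d %H:%M")
    return {"attribute": "time_display", "text": text, "position": position}
```

The resulting object would then be inserted into the playback data as one more image object alongside the decoded ones.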
[0269] Image data made up of image objects includ- 
ing an object whose playback form was changed as 
needed is output to and displayed on a CRT 28. Audio 
data made up of audio objects is output from a loud- 
speaker 27. 

[0270] An input image having undergone non-display 
processing/object replacement processing in the dis- 
play device can be freely displayed as shown in Figs. 
37A to 37D and 38A to 38C, as described in the above 
embodiments. 

[0271] Control of the playback form corresponding to the display mode
in the display device 76 according to the fourth embodiment of the
present invention will be explained with reference to the flow chart of
Fig. 42.
[0272] The display device 76 receives input AV data in the display mode
in step S201, and decodes it in step S202. In step S203, pieces of
object information are analyzed for image objects constituting image
data, and their attributes are checked based on various object codes.
In step S204, it is determined whether an image object having a
predetermined attribute such as "emergency news" or "subtitle of a
movie" exists in the results of analysis.
[0273] If NO in step S204, the flow advances to step S209 to directly
output the image data in the original form. If YES in step S204, the
flow advances to step S205 to read out a set value used in changing the
playback form.

[0274] If the set value is "1", the flow shifts to step S206, the image
object is changed to a generated icon, and the icon is output (playback
pattern (a)). If the set value is "0", the flow shifts to step S207,
and the image object is output without changing the original (playback
pattern (b)). If the set value is "2" in step S205, the flow shifts to
step S208 to inhibit output of the image object. In this case, no icon
is displayed either (playback pattern (d)).

[0275] In accordance with these three set values, the playback form of
the image object is changed, and the object is displayed and output
together with the other display data and/or audio data (step S209).

[0276] This set value can be arbitrarily changed. In step S210,
similarly to step S111, whether a set value change instruction is input
is checked. If YES in step S210, the flow proceeds to step S211, and
the set value can be easily changed by setting a new set value input
from an instruction input unit 30. After the set value is changed, if
an image object whose playback form is to be changed is found in step
S204, the flow shifts to step S205 to change the display form of the
image object having the designated attribute on the basis of the latest
changed set value.

[0277] If NO in step S210, the flow advances to step S212 to
repetitively execute the operation from step S201 as long as the
display mode continues. This operation is executed until the playback
mode ends in step S212.

[0278] As described above, according to the fourth embodiment, the
display form and audio output form in the display device 76 are defined
as a playback form. Change of the playback form is set for an image
object such as "emergency news" or "movie subtitle", and can be easily
realized.

[0279] In the fourth embodiment, an image object 
such as "emergency news" or "movie subtitle" has been 
exemplified. The present invention is not limited to this, 
and can be applied to all objects such as an image or 
audio to which the user wants to apply change of the 
playback form. 

[0280] Hence, only an object determined to be unnec- 
essary can be hidden in displaying an image, which en- 
ables more user-friendly display with a higher visual ef- 
fect. 

<Other Embodiment>

[0281] As another embodiment, a case will be described in which the
MPEG4 coding type video data (television data) assumed in the
above-described embodiments is implemented after being assembled in
part of MPEG2 coding type video data (television data).



[0282] Fig. 16 is a view showing the structure of an MPEG2 transport
stream as the transmission format of an MPEG2 data stream used in MPEG2
coding type digital television broadcasting. The structure in Fig. 16
will be explained.

[0283] The MPEG2 transport stream is multiplexed/demultiplexed by a
fixed-length transport packet. The data structure of the transport
packet is hierarchically expressed as shown in Fig. 16, and includes
items shown in Fig. 16.

[0284] The transport packet sequentially contains an 8-bit sync signal
(sync), an error display (error indicator) indicating the
presence/absence of a bit error in the packet, a unit start display
representing the start of a new unit from the payload of the packet, a
priority (packet priority) representing the degree of significance of
the packet, identification information PID (Packet Identification Data)
representing the attribute of an individual stream, scramble control
representing the presence/absence and type of scramble, adaptation
field control representing the presence/absence of the adaptation field
of the packet and the presence/absence of the payload, a cyclic counter
serving as information for detecting whether a packet having the same
PID was partially rejected during operation, an adaptation field
capable of optionally containing additional information or a stuffing
byte, and a payload (information).
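The header fields listed above can be unpacked as in the following Python sketch. The 188-byte packet length, the sync value 0x47, and the bit widths of the individual fields are taken from the MPEG-2 systems specification (ISO/IEC 13818-1) rather than from this description; the function name parse_ts_header and the dictionary keys are illustrative assumptions.

```python
TS_PACKET_SIZE = 188  # fixed packet length defined by ISO/IEC 13818-1
SYNC_BYTE = 0x47      # value of the 8-bit sync signal


def parse_ts_header(packet: bytes) -> dict:
    """Decode the 4-byte header of one MPEG2 transport packet."""
    if len(packet) != TS_PACKET_SIZE:
        raise ValueError("a transport packet must be 188 bytes long")
    if packet[0] != SYNC_BYTE:
        raise ValueError("sync byte not found")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return {
        "error_indicator": bool(b1 & 0x80),         # bit error present
        "unit_start": bool(b1 & 0x40),              # new unit starts in payload
        "priority": bool(b1 & 0x20),                # packet priority
        "pid": ((b1 & 0x1F) << 8) | b2,             # 13-bit stream identifier
        "scramble_control": (b3 >> 6) & 0x03,       # presence/type of scramble
        "adaptation_field_control": (b3 >> 4) & 0x03,
        "continuity_counter": b3 & 0x0F,            # 4-bit cyclic counter
    }
```

An adaptation_field_control value of 2 or 3 indicates that an adaptation field is present, which is where the additional data described in the following paragraphs would be carried.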

[0285] The adaptation field contains a field length, various items
about another individual stream, an optional field, and a stuffing byte
(invalid data byte). In this embodiment, an MPEG4 bitstream is
multiplexed as one of additional data in this field. The transport
packet of MPEG2 television broadcasting has this structure.
[0286] Non-display processing of a predetermined object and object
replacement processing according to the embodiment are realized in
consideration of a case wherein a desired image object and system data
such as time information or object information are assembled in an
MPEG4 bitstream multiplexed as additional data in MPEG2 system data in
MPEG2 television broadcasting using the above-described transport
stream.
[0287] At this time, as shown in Fig. 16, image objects (objects A, B,
and C in Fig. 16) formed from small data amounts of CGs (time display
image, weather forecast image, and the like), scene description
information (BIFS) of each object, and system data such as time
information and object information for identifying an image object are
multiplexed and transmitted as an MPEG4 bitstream in a predetermined
area of the adaptation field in MPEG2 system data. An ID representing
the presence of the MPEG4 data is added before (or before and after)
the area where MPEG4 data is multiplexed. This ID is used to identify
data.
[0288] Image data such as a CG assembled in part of MPEG2 data can
undergo object non-display processing or object replacement processing,
like MPEG4 video data described in the first and second embodiments.
[0289] In this case, if an ID representing the presence of MPEG4 data
can be identified from the MPEG2 bitstream, and MPEG4 data can be
individually extracted, image objects, object information, and time
information can be respectively extracted from the MPEG4 data.
Non-display processing of a predetermined image object in accordance
with necessity, or display control by replacement processing can be
easily achieved with the arrangement of the first or second embodiment.
The method and operation are the same as the above-described ones.
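Locating the multiplexed MPEG4 area by its ID can be sketched as follows. The description does not give the value or length of the ID, so the byte patterns START_ID and END_ID below are purely hypothetical placeholders, as is the function name extract_mpeg4_area. The sketch assumes the variant in which an ID is added both before and after the area, and treats a missing trailing ID as "no MPEG4 data".

```python
from typing import Optional

# Hypothetical ID patterns; the actual ID is not specified in the text.
START_ID = b"\x00M4S"
END_ID = b"\x00M4E"


def extract_mpeg4_area(adaptation_field: bytes,
                       start_id: bytes = START_ID,
                       end_id: bytes = END_ID) -> Optional[bytes]:
    """Return the bytes between the two IDs, or None if they are absent."""
    start = adaptation_field.find(start_id)
    if start < 0:
        return None  # no MPEG4 data multiplexed in this field
    begin = start + len(start_id)
    end = adaptation_field.find(end_id, begin)
    if end < 0:
        return None  # trailing ID missing; treat the field as not usable
    return adaptation_field[begin:end]
```

A practical format would also have to guarantee that the ID patterns cannot occur inside the MPEG4 data itself; that concern is outside this sketch.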

[0290] With this arrangement, the present invention 
can be applied to not only an MPEG4 television program 
but also an MPEG2 television program or video data 
containing MPEG4 data. 

[0291] MPEG2 data and MPEG4 data can share 
many coding/decoding circuits. Thus, the circuit ar- 
rangement can be effectively implemented without any 
complicated arrangement. Even for a software decoder, 
the system can be made efficient. 
[0292] With this arrangement, the present invention is 
easily adapted to a system in which an MPEG4 object 
is multiplexed in an MPEG2 television program because 
a real-time image object such as time display or weather 
forecast to which non-display processing or replace- 
ment processing of the embodiment is applied is often 
small data. 

[0293] The present invention may be applied to a system constituted by
a plurality of devices (e.g., a host computer, interface device,
reader, and printer) or an apparatus comprising a single device (e.g.,
a copying machine or facsimile apparatus).

[0294] The present invention can be realized even by supplying a
storage medium (or recording medium) which stores software program
codes for realizing
the functions of the above-described embodiments to a 
system or apparatus, and causing the computer (or a 
CPU or MPU) of the system or apparatus to read out 
and execute the program codes stored in the storage 
medium. In this case, the program codes read out from 
the storage medium realize the functions of the above- 
described embodiments by themselves, and the storage 
medium which stores the program codes constitutes the 
present invention. The functions of the above-described 
embodiments are realized not only when the computer 
executes the readout program codes, but also when the 
operating system (OS) running on the computer per- 
forms part or all of actual processing on the basis of the 
instructions of the program codes. 
[0295] The functions of the above-described embod- 
iments are also realized when the program codes read 
out from the storage medium are written in the memory 
of a function expansion board inserted into the computer
or that of a function expansion unit connected to the 
computer, and the CPU of the function expansion board 
or function expansion unit performs part or all of actual 
processing on the basis of the instructions of the pro- 
gram codes. 

[0296] As has been described above, according to the present invention,
an object having an attribute which was significant in recording (past)
but is insignificant in playback, such as an object having a real-time
attribute significant in recorded digital data, can be inhibited from
being displayed, or can be changed in the display form in
correspondence with the playback time. This is effective in adding a
new function for playback of a television program.

[0297] According to the embodiments, the apparatus and system having
the above arrangement enable more user-friendly video playback/display
with a higher visual effect, and can improve the quality of the user
interface.
[0298] According to the embodiments, playback output of an object
having a predetermined attribute can be controlled. As another effect,
the number of dubbing operations can be limited for only a
predetermined object, which is also effective in terms of copyrights.
[0299] According to the embodiments, an MPEG4 bitstream can be
assembled in an MPEG2 coding type television broadcasting system, and
an existing system can be utilized.

[0300] According to the embodiments, digital television broadcasting
can be easily combined with a personal computer (PC). Layout settings
performed on a PC desktop at present can be customized even for a
television image, so that television broadcasting and the PC are highly
compatible. The effect of expanding the market is expected in the field
of digital composite products.

[0301] The above embodiments have exemplified a reception/playback
system constituted by a reception device, recording/playback device,
and display device. The present invention is not limited to this, and
can also be applied to a device such as a television receiver having a
recording function that is constituted by integrating devices.

[0302] As many apparently widely different embodiments of the present
invention can be made without departing from the spirit and scope
thereof, it is to be understood that the invention is not limited to
the specific embodiments thereof except as defined in the appended
claims.



Claims 

1. An image processing apparatus for reproducing a 
recorded digital data stream, characterised by com- 
prising: 

determination means (66, S04, S24) for deter- 
mining whether an object having a predeter- 
mined attribute exists in the recorded digital da- 
ta stream; and 

reproducing means (S07-S11, S27-S31) for changing a reproducing form of
the object and reproducing the object when said determination means
determines that the object having the predetermined attribute exists.

2. The apparatus according to claim 1, characterised
in that the digital data stream includes a data stream 
coded by a digital television MPEG4 scheme, and 
has main data and sub-data, 

wherein the main data includes data having a 
plurality of objects divided in units of predetermined 
objects, and the sub-data includes attribute infor- 
mation of the objects. 

3. The apparatus according to claim 1 or 2, character- 
ised in that the digital data stream includes a digital 
television data stream containing an object coded 
by an MPEG4 scheme and the sub-data multi- 
plexed on an MPEG2 bitstream. 

4. The apparatus according to claim 1, characterised
in that said reproducing means does not display the
object in reproducing the object having the prede- 
termined attribute. 

5. The apparatus according to any one of claims 1-4,
characterised in that the predetermined attribute in- 
cludes a real-time information attribute which is sig- 
nificant in recording the digital data stream. 

6. The apparatus according to any one of claims 1-5,
characterised in that the apparatus further compris- 
es timepiece means (56, 67) for measuring current 
time, and 

said reproducing means replaces the object 
by another display based on time measurement by 
said timepiece means in reproducing the object 
having the predetermined attribute. 

7. A display device for receiving and displaying digital 
data reproduced by a reproducing device, charac- 
terised by comprising: 

determination means (51) for determining 
whether an object having a predetermined at- 
tribute exists in the digital data; and 
display control means (29) for changing a dis- 
play form of the object and displaying the object 
when said determination means determines 
that the object having the predetermined at- 
tribute exists. 

8. The device according to claim 7, characterised in 
that said display control means does not display the 
object in reproducing the object having the prede- 
termined attribute. 

9. The device according to claim 7 or 8, characterised 
in that the predetermined attribute includes a real- 
time information attribute which is significant in re- 
cording the digital data. 



10. The device according to any one of claims 7-9, characterised in
that the device further comprises timepiece means (56) for measuring
current time, and

said display control means (29) replaces the object by another display
based on time measurement by said timepiece means in reproducing the
object having the predetermined attribute.

11. An image processing method of reproducing a recorded digital data
stream, characterised by comprising:

the determination step (S24) of determining whether an object having a
predetermined attribute exists in the recorded digital data stream; and

the reproducing step (S28-S31) of changing a reproducing form of the
object and reproducing the object when the object having the
predetermined attribute is determined to exist in the determination
step.

12. The method according to claim 11, characterised in that the digital
data stream includes a data stream coded by a digital television MPEG4
scheme, and has main data and sub-data,

the main data includes data having a plurality of objects divided in
units of predetermined objects, and

the sub-data includes attribute information of the object.

13. The method according to claim 11 or 12, characterised in that the
digital data stream includes a digital television data stream
containing an object coded by an MPEG4 scheme and the sub-data
multiplexed on an MPEG2 bitstream.

14. The method according to claim 11, wherein said reproducing step
comprises not displaying the object in reproducing the object having
the predetermined attribute.

15. The method according to any one of claims 11-14, characterised in
that the predetermined attribute includes a real-time information
attribute which is significant in recording the digital data stream.

16. The method according to any one of claims 11-15, characterised in
that the method further comprises the timepiece step of measuring
current time, and

said reproducing step comprises replacing the object by another display
based on time measurement in the timepiece step in reproducing the
object having the predetermined attribute.

17. An image processing apparatus for reproducing a recorded digital
data stream, characterised by comprising:

determination means (66) for determining whether an object having a
predetermined attribute exists in the recorded digital data stream;

designation means (40) for designating a reproducing form of the object
having the predetermined attribute from a plurality of reproducing
forms; and

reproducing control means (39) for reproducing an image corresponding
to the object having the predetermined attribute in the reproducing
form designated by said designation means when said determination means
determines that the object having the predetermined attribute exists.



18. The apparatus according to claim 17, characterised in that the
reproducing form designated by said designation means includes
reproducing by using an icon corresponding to the predetermined
attribute.

19. The apparatus according to claim 17, characterised in that the
reproducing form designated by said designation means further includes
reproducing by using an audio object.

20. The apparatus according to claim 17, characterised in that the
reproducing form designated by said designation means includes
non-display of the object having the predetermined attribute.

21. The apparatus according to claim 17, characterised in that the
reproducing form designated by said designation means includes display
of time information obtained in recording the digital data stream.

22. The apparatus according to any one of claims 17-21, characterised
in that the predetermined attribute includes an emergency news telop.

23. The apparatus according to any one of claims 

1 7-22, characterised in that said designation means 45 
comprises: 



24. The apparatus according to any one of claims 

1 7-23, characterised in that the digital data stream 55 
includes a data stream coded by a digital television 
MPEG4 scheme, and has main data and sub-data, 



the main data includes data having a plurality 
of objects divided in units of predetermined ob- 
jects, and 

the sub-data includes attribute information of 
the object. 

25. The apparatus according to claim 24, characterised 
in that the digital data stream includes a digital tel- 
evision data stream containing an object coded by 
an MPEG4 scheme and the sub-data multiplexed 
on an MPEG2 bitstream. 

26. The apparatus according to any one of claims 
17-25, characterised in that designation by said 
designation means can be executed during repro- 
ducing of the object having the predetermined at- 
tribute. 

27. An image processing method in an image process- 
ing apparatus for reproducing a recorded digital da- 
ta stream, characterised by comprising: 

the determination step (S104) of determining 
whether an object having a predetermined at- 
tribute exists in the recorded digital data 
stream; 

the designation step (S105) of designating a
reproducing form of the object having the
predetermined attribute from a plurality of
reproducing forms; and

the reproducing control step (S107-S110) of
reproducing an image corresponding to the object
having the predetermined attribute in the
reproducing form designated in the designation
step when the object having the predetermined
attribute is determined to exist in the
determination step.

28. The method according to claim 27, characterised in 
that the reproducing form designated in the desig- 
nation step includes reproducing by using an icon 
corresponding to the predetermined attribute. 

29. The method according to claim 27, characterised in 
that the reproducing form designated in the desig- 
nation step further includes reproducing by using an 
audio object. 

30. The method according to claim 27, characterised in 
that the reproducing form designated in the desig- 
nation step includes non-display of the object hav- 
ing the predetermined attribute. 

31. The method according to claim 27, characterised in
that the reproducing form designated in the desig- 
nation step includes display of time information ob- 
tained in recording the digital data stream. 



instruction means for instructing a reproduced 
icon on a display screen; and 

means for changing a set value for designating
the reproducing form in accordance with
instruction operation of said instruction means.






32. The method according to any one of claims 27-31,
characterised in that the predetermined attribute
includes an emergency news telop.

33. The method according to any one of claims 27-32, 
characterised in that the designation step compris- 
es: 

the instruction step of instructing a reproduced 
icon on a display screen; and 
the step of changing a set value for designating 
the reproducing form in accordance with in- 
struction operation in the instruction step. 

34. The method according to any one of claims 27-33,
characterised in that the digital data stream in- 
cludes a data stream coded by a digital television 
MPEG4 scheme, and has main data and sub-data, 

the main data includes data having a plurality 
of objects divided in units of predetermined ob- 
jects, and 

the sub-data includes attribute information of 
the object. 

35. The method according to claim 34, characterised in 
that the digital data stream includes a digital televi- 
sion data stream containing an object coded by an 
MPEG4 scheme and the sub-data multiplexed on 
an MPEG2 bitstream. 



tribute exists in the recorded digital data 
stream; 

a designation step module for designating a
reproducing form of the object having the
predetermined attribute from a plurality of
reproducing forms; and

a reproducing control step module for
reproducing an image corresponding to the object
having the predetermined attribute in the
reproducing form designated in the designation step
module when the object having the predetermined
attribute is determined to exist in the
determination step module.




36. The method according to claim 34, characterised in 
that designation in the designation step can be ex- 
ecuted during reproducing of the object having the 
predetermined attribute.



37. A computer-readable recording medium which 
stores a program of executing an image processing 
method of reproducing a recorded digital data 
stream, characterised by comprising:



a determination step module for determining 
whether an object having a predetermined at- 
tribute exists in the recorded digital data 
stream; and
a reproducing module for changing a reproducing
form of the object and reproducing the object
when the object having the predetermined
attribute is determined to exist in the
determination step module.



38. A computer-readable recording medium which 
stores a program of executing an image processing 
method of reproducing and displaying a recorded 
digital data stream, characterised by comprising:



a determination step module for determining 
whether an object having a predetermined at- 
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